Abstract. In this paper, the space complexity of nonuniform quantum algorithms
is investigated using the model of quantum branching programs (QBPs). In order to
clarify the relationship between QBPs and nonuniform quantum Turing machines,
simulations between these two models are presented which allow the transfer of upper
and lower bound results. Exploiting additional insights about the connection
between the running time and the precision of amplitudes, it is shown that nonuniform
quantum Turing machines with algebraic amplitudes and QBPs with a suitable
analogous set of amplitudes are equivalent in computational power if both models
work with bounded or unbounded error. Furthermore, quantum ordered binary
decision diagrams (QOBDDs) are considered, which are restricted QBPs that can
be regarded as a nonuniform analog of one-way quantum finite automata. Upper
and lower bounds are proved that allow a classification of the computational power
of QOBDDs in comparison to usual deterministic and randomized variants of the
model. Finally, an extension of QBPs is proposed where the performed unitary
operation may depend on the result of a previous measurement. A simulation of
randomized BPs by this generalized QBP model as well as exponential lower bounds
for its ordered variant are presented.

1. Introduction

The intriguing open question behind the research on quantum computing is whether there are 
problems that can be solved more efficiently by quantum computers than by classical ones. 
Shor's famous quantum algorithm for factoring integers in polynomial time [33] provides the 
most conclusive evidence so far in favor of an affirmative answer to this question. The notion of a
quantum algorithm is made precise by models of computation such as quantum Turing machines
(QTMs), quantum circuits, quantum finite automata (QFAs), and quantum communication
protocols. For an introduction to these models, we refer to the textbooks of Gruska;
Kitaev, Shen, and Vyalyi; and Nielsen and Chuang.

Apart from the obviously important computation time, various other complexity measures for
quantum algorithms have been investigated. Space is a crucial resource due to inherent technical
constraints in the current physical realizations of quantum computers. As pointed out by
Ambainis and Freivalds [7], the goal of obtaining systems with a small quantum mechanical part
was one of the motivations for considering quantum finite automata. In his seminal paper [39]
and its later extensions [40, 41], Watrous investigated the space complexity of quantum algorithms
in the more general model of quantum Turing machines. The quantum Turing machines
considered by Watrous may have algebraic transition amplitudes and are unidirectional, i.e.,
the direction of the head movements is a function of the state entered in a computation step.
Among other results, he has shown for this scenario that space $O(s)$ probabilistic Turing machines
with unbounded error and quantum Turing machines with unbounded error are equivalent
in computational power, where $s$ is a space-constructible function. It is open whether similar
statements hold for other types of error, e.g., bounded error. It is also not known whether the
requirement of algebraic transition amplitudes is crucial for space-restricted quantum Turing
machines, despite the results of Adleman, DeMarrais, and Huang [3] that allow us to restrict the
set of amplitudes to $\{0, \pm 3/5, \pm 4/5, \pm 1\}$ for polynomial time, bounded error quantum Turing
machines. Finally, even the standard assumption of unidirectionality remains to be justified for
QTMs with sublinear space bounds, since the known simulations for the time-bounded case due to
Bernstein and Vazirani and Yao [43] or Nishimura and Ozawa [27] cannot be applied in an
obvious way.

Classical Turing machines have already turned out to be a rather cumbersome device for proving
upper and lower bounds. Branching programs are a graphical representation of boolean functions
and as such are more amenable to combinatorial arguments than Turing machines. Furthermore,
it is well known that the logarithm of the size of branching programs is asymptotically
equal to the space complexity for the nonuniform (advice taking) variant of Turing machines
(Cobham, Pudlak and Zak). Recently obtained lower bound results for branching programs,
which imply time-space tradeoffs for sequential computations, underline the
significance of branching programs in the investigation of space complexity.

In this paper we deal with a quantum variant of branching programs. In order to give a feeling
of how quantum branching programs (QBPs) work, we consider the example in Figure 1. For
the formal definition and the technical details we refer to Definitions 2.4 and 2.5. The QBP
in the figure represents a boolean function depending on the variables $x_1$ and $x_2$. Each node
$v \in V = \{v_1, \ldots, v_6\}$ of the QBP is associated with a vector $|v\rangle$ of an orthonormal basis of
the Hilbert space $\mathcal{H} = \mathbb{C}^{|V|}$. Each intermediate state of the computation of the QBP is a
vector in $\mathcal{H}$. The initial state of the QBP is $|v_1\rangle$, where $v_1$ is the start node of the QBP. Each
computation step consists of a first phase, where a projective measurement is used to decide
whether the computation continues or whether it stops with the result 0 or 1, and a second phase,
where a unitary transformation described by the edge labels is applied to the state. If $x_i = 0$
($x_i = 1$), only the dashed (solid) edges leaving each $x_i$-node contribute to this transformation.

In our example the projections describing the measurement are
$E_{\mathrm{cont}} = |v_1\rangle\langle v_1| + \cdots + |v_4\rangle\langle v_4|$,
$E_0 = |v_5\rangle\langle v_5|$, and $E_1 = |v_6\rangle\langle v_6|$, i.e., the projections on the subspaces spanned by the vectors
corresponding to interior nodes and sinks labeled by 0 and 1, resp. Assume that $x_1 = x_2 = 0$.
The initial state is $|v_1\rangle$. The projective measurement yields that the computation is continued
with probability 1. The dashed edges leaving $v_1$ are labeled by $1/\sqrt{2}$; hence, the next state is
$(1/\sqrt{2})(|v_2\rangle + |v_3\rangle)$. In the second step the computation again continues with probability 1 and,
according to the labels of the edges leaving $v_2$ and $v_3$, the next state is $|v_6\rangle$. Hence, in the third
step the computation stops with probability 1 and the result is 1.

The most important complexity measures for QBPs are the size of the QBP, i. e., its number of 
nodes, and the (expected or worst-case) computation time. QBPs may be cyclic or acyclic. For 
acyclic QBPs one can furthermore consider the width of the QBP, i.e., the maximum number 
of nodes with the same distance from the start node. Before we present our results on the 
relationship between the complexity measures for QBPs and other complexity measures for 
boolean functions, in particular the space complexity of quantum Turing machines, we discuss 
previous work on QBPs. 

Ablayev, Gainutdinova, and Karpinski and Nakanishi, Hamaguchi, and Kashiwabara [24]
have introduced quantum OBDDs (quantum ordered binary decision diagrams), i.e., acyclic
QBPs where the input variables may only be read once in a fixed order during each computation.
Ablayev, Gainutdinova, and Karpinski have presented a function that requires linear
width in the input length for deterministic OBDDs, but only logarithmic width for quantum
OBDDs. Nakanishi, Hamaguchi, and Kashiwabara have obtained a similar gap, but their lower
bound even holds for randomized OBDDs. More recently, Ablayev, Moore, and Pollett [2] have
proved that the class of functions that can be exactly computed by oblivious width-2 QBPs
of polynomial size coincides with the class $NC^1$, while width 5 is necessary classically unless
$NC^1 = ACC$. Finally, Spalek [37] has studied a general model of QBPs and has independently
come up with a definition similar to that used here. Furthermore, he has also presented exact 
simulations between QBPs whose transition function is composed of unitary matrices from a 
finite basis and quantum Turing machines defined analogously. In the following, we describe 
the contributions of our paper. For the sake of a clearer presentation, we group the results into 
three parts. 

First Part: Simulations (Sections 2-5). In Sections 2 and 3 we define quantum branching programs
and extend the definition of quantum Turing machines (QTMs) to the nonuniform case.
Following Watrous [39, 40, 41], we include unidirectionality as a part of our definition of QBPs
and we usually consider unidirectional nonuniform QTMs. Simulations between QBPs and unidirectional
nonuniform QTMs are presented in Section 4. Our first result shows that unidirectional
nonuniform QTMs using space $O(\log S)$ can be simulated by QBPs of size $\mathrm{poly}(S)$ taking
the same number of computation steps as the simulated machine. In the opposite direction,
we obtain an approximate simulation of QBPs of size $S$ by unidirectional nonuniform QTMs
that carry out $T$ simulation steps with approximation error $\epsilon$ in space $\mathrm{poly}(S + \log\log(T/\epsilon))$
and time $\mathrm{poly}(S, T, \log(1/\epsilon))$. These results are for QBPs and QTMs whose amplitudes are
arbitrary complex numbers.

As remarked above, the standard set of transition amplitudes for QTMs in the space-bounded
scenario is the set of algebraic numbers. As an analogous standard set for QBPs we propose short
amplitudes, i.e., amplitudes that can be represented in polynomial bit length in the size of the
QBP as rational polynomials in finitely many algebraic numbers. Using our general simulation
results and additional insights about the connection between running time and the precision
of amplitudes, we show that in the case of bounded and unbounded error, QBPs with short
amplitudes and size $\mathrm{poly}(S)$ and unidirectional nonuniform QTMs with algebraic amplitudes
using space $O(\log S)$ are of the same computational power.

In Section 5, we justify our standard assumption of unidirectionality for the considered models.
We provide a space-efficient approximate simulation of (general) nonuniform QTMs by unidirectional
ones. In particular, this result yields that $O(\log S)$ space nonuniform QTMs, $O(\log S)$
space unidirectional nonuniform QTMs, and $\mathrm{poly}(S)$ size QBPs are of the same computational
power if these models work with algebraic and short amplitudes, resp., and with bounded or unbounded
error. Altogether, these arguments show that QBPs are a suitable model for exploring
space-bounded nonuniform quantum complexity.

Second Part: QOBDDs (Section 6). We explore the relationship between the size of quantum 
OBDDs (QOBDDs) and classical OBDDs. First, we design polynomial size QOBDDs for a 
function that classical deterministic OBDDs can only represent in exponential size, as well as 
for a partially defined function for which even randomized OBDDs require exponential size. 
On the other hand, even very simple functions can be hard for QOBDDs. We show that for 
the disjointness function $(x_1 \vee x_2) \wedge (x_3 \vee x_4) \wedge \cdots \wedge (x_{n-1} \vee x_n)$ as well as the inner product
function $x_1 x_2 \oplus x_3 x_4 \oplus \cdots \oplus x_{n-1} x_n$, QOBDDs require exponential size, while deterministic
OBDDs can represent these functions in linear size. Finally, we prove that zero error QOBDDs 
of polynomial size are no more powerful than polynomial size reversible OBDDs. 

Third Part: QBPs with Generalized Measurements (Section 7). For quantum OBDDs as well 
as for quantum finite automata, the unitarity requirement of quantum algorithms is a serious 
restriction. Intuitively, the problem is that it is difficult in these models to forget input already 
read. In Section 7 we study the question of whether it may help to allow measurements to choose 
the unitary transformation for the next computation step (apart from checking whether the 
computation has stopped). For quantum circuits this question has already been considered by 
Aharonov, Kitaev, and Nisan, who have proposed to describe the states and the computations
of quantum circuits by mixed states and superoperators, resp. We define natural variants of 
QBPs and QOBDDs with generalized measurements and investigate some of their properties. 
QBPs and QOBDDs with generalized measurements can simulate their randomized counterpart 
without increase in size. On the other hand, we prove an exponential lower bound on the size 
of QOBDDs with generalized measurements for all so-called $k$-stable functions. This class
includes, e. g., the function checking for the presence of a clique in a graph and the determinant 
of a boolean matrix. 

2. Quantum Branching Programs 

In this section, we define classical and quantum variants of branching programs and discuss 
basic properties of the quantum variant. An extensive survey of results for classical branching 
programs is given in the monograph of Wegener [42].

Definition 2.1: A (deterministic) branching program (BP) on the variable set $X = \{x_1, \ldots, x_n\}$
is a directed acyclic graph with a designated start node and two sinks. The sinks are labeled
by the constants 0 and 1, resp. Each interior node is labeled by a variable from $X$ and has
two outgoing edges carrying labels 0 and 1, resp. This graph computes a boolean function $f$
defined on $X$ as follows. To compute $f(a)$ for some input $a = (a_1, \ldots, a_n) \in \{0,1\}^n$, start at
the start node. For an interior node labeled by $x_i$, follow the edge labeled by $a_i$ (this is called
testing the variable). Iterate this until a sink is reached, whose label gives the value $f(a)$. For
a fixed input $a$, the sequence of nodes visited in this way is called the computation path for
$a$. The size $|G|$ of a branching program $G$ is the number of its nodes. Its width is the maximum
number of nodes with the same distance from the start node. The branching program size of a
function $f$ is the minimum size of a branching program that computes it.
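
The evaluation rule of Definition 2.1 can be illustrated by a minimal Python sketch. The dictionary encoding of the graph, the function name `evaluate_bp`, and the example BP for $x_1 \oplus x_2$ are assumptions made for illustration only; they do not appear in the definition.

```python
def evaluate_bp(nodes, sinks, start, a):
    """Evaluate a deterministic BP on input a; return (value, computation path).

    nodes: interior node -> (variable index, 0-successor, 1-successor)
    sinks: sink node -> constant 0 or 1
    (This encoding is a hypothetical choice for the sketch.)
    """
    v = start
    path = [v]
    while v not in sinks:
        var, succ0, succ1 = nodes[v]
        # Testing the variable: follow the edge labeled by a[var].
        v = succ1 if a[var] == 1 else succ0
        path.append(v)
    return sinks[v], path


# Hypothetical example: a BP for x_1 XOR x_2 (variables indexed from 0).
nodes = {"s": (0, "p", "q"), "p": (1, "zero", "one"), "q": (1, "one", "zero")}
sinks = {"zero": 0, "one": 1}

value, path = evaluate_bp(nodes, sinks, "s", (1, 0))
size = len(nodes) + len(sinks)  # the size counts all nodes, including sinks
```

Since the graph is acyclic, the loop terminates after at most as many iterations as there are nodes.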
BPs are a nonuniform model of computation, so we usually consider a sequence $(G_n)_{n \in \mathbb{N}}$ of
BPs representing a sequence of boolean functions $(f_n)_{n \in \mathbb{N}}$, where $G_n$ represents the function
$f_n\colon \{0,1\}^n \to \{0,1\}$. We will encounter the following variants of BPs.

Definition 2.2: 

A BP is called read-once if, for each variable $x_i$, each of the paths in the BP contains at
most one node labeled by $x_i$.

A BP is called leveled if the set of its nodes can be partitioned into disjoint sets $V_1, \ldots, V_\ell$,
where $V_i$ is called the $i$th level, such that for $1 \le i \le \ell - 1$, each edge leaving a node in $V_i$
reaches a node in $V_{i+1}$.

An OBDD (ordered binary decision diagram) is a read-once BP where on each computation
path the variables are tested according to the same order. For the variable order $\pi$ it is also
called $\pi$-OBDD.

Definition 2.3: A randomized BP is defined as a deterministic BP, but may additionally
contain unlabeled randomized nodes with two unlabeled outgoing edges, may contain cycles,
and may have sinks labeled by 0, 1, or "?". The computation for an input $a$ is carried out
by starting at the start node, following the outgoing edge labeled by $a_i$ for an $x_i$-node as for
deterministic BPs, and taking one of the outgoing edges with probability 1/2 for randomized
nodes until a sink is reached, where different randomized decisions are independent of each other.
The probability that the randomized BP computes the output $r \in \{0, 1, ?\}$ for the input $a$ is
the probability that the computation for $a$ reaches a sink labeled by $r$.

Different modes of acceptance with unbounded, bounded (two-sided), one-sided, and zero error
are defined as usual (see, e.g., [32, 42]). Randomized variants of the restricted models of BPs
from Definition 2.2 are obtained by applying the respective restriction to the nodes labeled by
variables.

Next, we define a quantum variant of BPs. This definition contains the alternative definitions 
in the literature as special cases. 

Definition 2.4: A quantum branching program (QBP) over the variable set $X = \{x_1, \ldots, x_n\}$ is
a directed multigraph $G = (V, E)$ with a start node $s \in V$, a set $F \subseteq V$ of sinks, and (transition)
amplitudes $\delta\colon V \times V \times \{0,1\} \to \mathbb{C}$. Each node $v \in V - F$ is labeled by a variable $x_i \in X$ and
we define $\mathrm{var}(v) = i$. Each node $v \in F$ carries a label from $\{0, 1, ?\}$, denoted by $\mathrm{label}(v)$. Each
edge $(v,w) \in E$ is labeled by a boolean constant $b \in \{0,1\}$ and the amplitude $\delta(v,w,b)$. An
edge with boolean label $b$ is called $b$-edge for short. We assume that there is at most one edge
carrying the same boolean label between a pair of nodes and set $\delta(v,w,b) = 0$ for all $(v,w) \notin E$
and $b \in \{0,1\}$.

The graph $G$ is required to satisfy the following two constraints. First, it has to be well-formed,
meaning that for each pair of nodes $u, v \in V - F$ and all assignments $a = (a_1, \ldots, a_n)$ to the
variables in $X$,

$$\sum_{w \in V} \delta^*(u, w, a_{\mathrm{var}(u)})\, \delta(v, w, a_{\mathrm{var}(v)}) = \begin{cases} 1, & \text{if } u = v; \\ 0, & \text{otherwise.} \end{cases} \qquad \text{(W)}$$

Second, $G$ has to be unidirectional, which means that for each $w \in V$, all nodes $v \in V$ such
that $\delta(v,w,b) \neq 0$ for some $b \in \{0,1\}$ are labeled by the same variable.
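
The two constraints of Definition 2.4 can be checked mechanically for small instances. The following Python sketch does this by brute force over all assignments. The six-node QBP, the dictionary encoding of $\delta$, and the function names are hypothetical choices for illustration; the instance is modeled on the introductory example but is not claimed to coincide with the QBP in the figure.

```python
import itertools
import math

R = 1 / math.sqrt(2)

# Hypothetical QBP: interior nodes v1..v4, sinks v5 (label 0) and v6 (label 1).
var = {"v1": 0, "v2": 1, "v3": 1, "v4": 0}  # variable index of each interior node
# delta[(v, b)] lists the b-edges leaving v as (target, amplitude) pairs.
delta = {
    ("v1", 0): [("v2", R), ("v3", R)],  ("v1", 1): [("v2", 1.0)],
    ("v4", 0): [("v2", R), ("v3", -R)], ("v4", 1): [("v3", 1.0)],
    ("v2", 0): [("v5", R), ("v6", R)],  ("v2", 1): [("v5", 1.0)],
    ("v3", 0): [("v5", -R), ("v6", R)], ("v3", 1): [("v6", 1.0)],
}


def is_well_formed(var, delta, n, tol=1e-9):
    """Check constraint (W) for all pairs of interior nodes and all assignments."""
    for a in itertools.product((0, 1), repeat=n):
        for u in var:
            for v in var:
                row_u = dict(delta[(u, a[var[u]])])
                row_v = dict(delta[(v, a[var[v]])])
                inner = sum(complex(row_u[w]).conjugate() * row_v[w]
                            for w in row_u if w in row_v)
                if abs(inner - (1 if u == v else 0)) > tol:
                    return False
    return True


def is_unidirectional(var, delta):
    """All nodes with a nonzero-amplitude edge into w must test the same variable."""
    entering = {}
    for (v, _bit), edges in delta.items():
        for w, amp in edges:
            if amp != 0:
                entering.setdefault(w, set()).add(var[v])
    return all(len(variables) == 1 for variables in entering.values())
```

The brute-force loop over all $2^n$ assignments is only meant for toy instances; constraint (W) depends on $a$ only through the two bits $a_{\mathrm{var}(u)}$ and $a_{\mathrm{var}(v)}$, so a more careful check would enumerate just those.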
The well-formedness constraint implies that the QBP has a unitary time evolution operator
(see below) and is, therefore, motivated by the laws of quantum theory. Unidirectionality is a
property that makes understanding and manipulating models of quantum computation much
easier. We discuss this issue in more detail in Section 3. Since unidirectionality is crucial for
our simulations, we include this requirement in the definitions of QBPs. Next, we define the
semantics of QBPs.

Definition 2.5 (Computation of a QBP): Let $G = (V,E)$ be a QBP on $n$ variables with
start node $s \in V$, sinks $F \subseteq V$, and transition amplitudes $\delta$. Let $\mathcal{H} = \mathbb{C}^{|V|}$ and let $(|v\rangle)_{v \in V}$ be
an orthonormal basis of $\mathcal{H}$. Let $a = (a_1, \ldots, a_n)$ be an assignment to the variables of $G$. Let
$L(a)$ be the linear transformation from the subspace spanned by all $|v\rangle$, $v \in V - F$, into $\mathcal{H}$ such
that for $v \in V - F$,

$$L(a)|v\rangle = \sum_{w \in V} \delta(v, w, a_{\mathrm{var}(v)})\,|w\rangle.$$

Due to the well-formedness constraint (W), $L(a)$ can be extended to a unitary transformation
$U(a)$ on $\mathcal{H}$. Call $U(a)$ a time evolution operator of the QBP for input $a$. Define projection
operators on $\mathcal{H}$ by setting

$$E_{\mathrm{cont}} = \sum_{v \in V-F} |v\rangle\langle v|, \quad E_{\mathrm{stop}} = \sum_{v \in F} |v\rangle\langle v|, \quad\text{and}\quad E_r = \sum_{v \in F,\ \mathrm{label}(v) = r} |v\rangle\langle v| \quad\text{for } r \in \{0, 1, ?\}.$$

For $T \in \mathbb{N}$ and $r \in \{0, 1, ?\}$ define

$$p_{G,r}(a,T) = \sum_{t=0}^{T} \bigl\| E_r \,(U(a) E_{\mathrm{cont}})^t\, |s\rangle \bigr\|^2 \quad\text{and}\quad p_{G,r}(a) = p_{G,r}(a,\infty),$$

the probability that $G$ outputs $r$ for input $a$ during the first $T$ time steps and the (absolute)
probability that $G$ outputs $r$ for input $a$, resp.
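
To make the semantics of Definition 2.5 concrete, the following Python sketch computes the probabilities $p_{G,r}(a,T)$ for a small QBP by iterating the measurement and the transition step. It relies on the observation that the projection onto the interior nodes is applied before each transition, so only $L(a)$ is needed and no particular unitary extension $U(a)$ has to be chosen. The six-node instance, the encoding of $\delta$, and the function name are assumptions for illustration; the instance mimics the introductory example without being taken from the figure.

```python
import math

R = 1 / math.sqrt(2)

# Hypothetical QBP: interior nodes v1..v4, sinks v5 (label 0) and v6 (label 1).
var = {"v1": 0, "v2": 1, "v3": 1, "v4": 0}
label = {"v5": 0, "v6": 1}
delta = {
    ("v1", 0): [("v2", R), ("v3", R)],  ("v1", 1): [("v2", 1.0)],
    ("v4", 0): [("v2", R), ("v3", -R)], ("v4", 1): [("v3", 1.0)],
    ("v2", 0): [("v5", R), ("v6", R)],  ("v2", 1): [("v5", 1.0)],
    ("v3", 0): [("v5", -R), ("v6", R)], ("v3", 1): [("v6", 1.0)],
}


def output_probabilities(var, label, delta, start, a, T):
    """Return {r: p_{G,r}(a, T)} for r in {0, 1, '?'}."""
    probs = {0: 0.0, 1: 0.0, "?": 0.0}
    state = {start: 1.0 + 0j}  # amplitudes of (U(a) E_cont)^t |s>, starting with t = 0
    for _ in range(T + 1):
        nxt = {}
        for v, amp in state.items():
            if v in label:
                # E_r: the squared amplitude on a sink is the probability of output r.
                probs[label[v]] += abs(amp) ** 2
            else:
                # E_cont followed by L(a): interior nodes pass on their amplitude.
                for w, d in delta[(v, a[var[v]])]:
                    nxt[w] = nxt.get(w, 0j) + amp * d
        state = nxt
    return probs


p = output_probabilities(var, label, delta, "v1", (0, 0), 2)
```

For the assignment $(0,0)$ the amplitudes on the sink labeled 0 cancel, so in this toy instance the output 1 is produced with probability 1 in the third step, as in the introductory example.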

QBPs computing a function $f\colon \{0,1\}^n \to \{0,1\}$ with unbounded error, bounded (two-sided)
error, and one-sided error are defined in the straightforward way. We say that $G$ computes $f$
with zero error and failure probability $\epsilon$, $0 \le \epsilon < 1$, if $p_{G,\neg f(a)}(a) = 0$ and $p_{G,?}(a) \le \epsilon$ for all
$a \in \{0,1\}^n$. We say that $G$ computes $f$ exactly if it computes $f$ with zero error and failure
probability 0.

Let the (worst-case) running time of $G$ on $a$ be

$$T_G(a) = \min\{T \mid T \in \mathbb{N}_0 \cup \{\infty\},\ p_{G,0}(a,T) + p_{G,1}(a,T) + p_{G,?}(a,T) = 1\}.$$

The running time can be in $\mathbb{N}_0$, infinite, or undefined. The expected running time of $G$ on $a$ is
defined by

$$\overline{T}_G(a) = \sum_{t=0}^{\infty} t \cdot \bigl\| E_{\mathrm{stop}} \,(U(a) E_{\mathrm{cont}})^t\, |s\rangle \bigr\|^2.$$

We say that $G$ runs in time $T$ if $T_G(a) \le T$ for all $a \in \{0,1\}^n$. Furthermore, $G$ runs in expected
time $T$ if $\overline{T}_G(a) \le T$ for all $a \in \{0,1\}^n$.

Since the QBP does not have edges leaving the sinks, the time evolution operator is merely an 
extension of the mapping L(a) and, therefore, not necessarily uniquely determined. 

In the remainder of this section we discuss the relationship between (classical) BPs and QBPs, 
and some variants of the definition of QBPs. Because of the well-formedness and the unidirec- 
tionality requirements of QBPs it is not obvious whether functions with small size BPs also have 
small size QBPs. In order to prove such a statement, we introduce the notion of reversibility. 
Definition 2.6: A BP is reversible if each node is reachable from at most one node $v$ by a
0-edge and from at most one node $w$ by a 1-edge, and $v$ and $w$ are labeled by the same variable.
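
The reversibility condition of Definition 2.6 is purely local and can be tested by collecting the predecessors of each node. The following Python sketch does so; the graph encoding, the function name `is_reversible`, and the two example BPs are hypothetical and serve only as an illustration.

```python
def is_reversible(nodes):
    """Check the condition of Definition 2.6.

    nodes: interior node -> (variable index, 0-successor, 1-successor)
    (Sinks only occur as successors in this hypothetical encoding.)
    """
    preds = {}  # target node -> {0: [sources via 0-edges], 1: [sources via 1-edges]}
    for v, (_var, succ0, succ1) in nodes.items():
        preds.setdefault(succ0, {0: [], 1: []})[0].append(v)
        preds.setdefault(succ1, {0: [], 1: []})[1].append(v)
    for by_bit in preds.values():
        if len(by_bit[0]) > 1 or len(by_bit[1]) > 1:
            return False  # two edges with the same boolean label enter the node
        if by_bit[0] and by_bit[1]:
            v, w = by_bit[0][0], by_bit[1][0]
            if nodes[v][0] != nodes[w][0]:
                return False  # the two predecessors test different variables
    return True


# A single test of x_1 is reversible.
single_test = {"s": (0, "zero", "one")}
# The obvious BP for x_1 AND x_2 is not: the 0-sink is entered by two 0-edges.
conjunction = {"s": (0, "zero", "t"), "t": (1, "zero", "one")}
```

The second example shows why a simulation such as the one mentioned below is needed: even very small BPs typically merge computation paths and are therefore not reversible as given.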

Reversible BPs are obviously special QBPs. Furthermore, as proved by Spalek [37] using a
similar construction of Lange, McKenzie, and Tapp [23] for Turing machines, any (possibly
non-reversible) BP of size $s(n) = \Omega(n)$ can efficiently be simulated by a reversible one of
size $\mathrm{poly}(s(n))$. This implies:

Proposition 2.7 ([37]): If the sequence of functions $(f_n)_{n \in \mathbb{N}}$ has BPs $(G_n)_{n \in \mathbb{N}}$ of size $s(n) = \Omega(n)$,
it also has QBPs $(G'_n)_{n \in \mathbb{N}}$ of size $\mathrm{poly}(s(n))$.

Adleman, DeMarrais, and Huang [3] have shown that uniform QTMs with arbitrary complex
amplitudes can decide certain languages of arbitrarily high Turing degree in polynomial time 
and are thus too powerful to be realistic. For randomized classical as well as quantum models of 
computation, practical considerations (depending on the details of the physical implementation 
of the model) lead to restrictions on the set of allowed amplitudes. However it is not obvious 
what a natural restriction in the nonuniform, space-bounded scenario is. The following definition 
is motivated by the goal of finding the least restrictive definition that still allows the resulting 
QBPs to be simulated efficiently by the corresponding standard QTM model. Recall that an 
algebraic number (over $\mathbb{Q}$) is an $x \in \mathbb{C}$ such that there is a nonzero rational polynomial with root $x$.

Definition 2.8: A sequence $(G_n)_{n \in \mathbb{N}}$ of QBPs has short amplitudes if for some number $k$
independent of the input length there are algebraic numbers $\alpha_1, \ldots, \alpha_k$ such that each
amplitude of each $G_n$ can be written as $p(\alpha_1, \ldots, \alpha_k)$ for some $k$-variate rational polynomial $p$
of degree $\mathrm{poly}(|G_n|)$ whose coefficients are fractions with numerator and denominator each of
bit length at most $\mathrm{poly}(|G_n|)$.

The requirements of this definition are obviously satisfied in the special case that the sequence
of QBPs uses only amplitudes from a fixed, finite set of algebraic numbers. This is the situation
investigated for uniform, space-restricted QTMs by Watrous [40, 41]. Among other results, we
show in Section 4 that unidirectional nonuniform QTMs with algebraic amplitudes and QBPs
with short amplitudes are equivalent in computational power under space restrictions, which
serves as a motivation for the above definition.

We conclude the discussion on reasonable restrictions for the amplitudes with some simple observations.
First, QBPs with complex amplitudes can be transformed into equivalent QBPs
with real amplitudes, where the number of nodes increases by a factor of at most 2 (cf. Proposition 5.3
in [1]). The main idea is to replace each node $v$ with two nodes $v_r$ and $v_i$ such that
the corresponding vectors $|v_r\rangle$ and $|v_i\rangle$ carry the real and imaginary part of the amplitude of
$|v\rangle$, resp. Second, in Definition 2.8 the number $k$ of algebraic numbers can be replaced with
1, since by the primitive element theorem from algebra, the algebraic numbers $\alpha_1, \ldots, \alpha_k$ can
be represented as polynomials in a single algebraic number $\alpha$. Since $k$ as well as $\alpha_1, \ldots, \alpha_k$
are independent of the input size, these polynomials have a constant number of constant
coefficients such that the resulting QBP still has short amplitudes. Finally, since the bit lengths
of the denominators of all coefficients are bounded by $\mathrm{poly}(|G_n|)$ and the number of edges and,
therefore, the number of denominators is bounded by $2|G_n|^2$, all the coefficients have a common
denominator $m$ of bit length $\mathrm{poly}(|G_n|)$. We obtain the following result.
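
The first observation, splitting each amplitude into a real and an imaginary part, can be illustrated on the level of matrices. The Python sketch below uses the standard embedding that maps a unitary $U = A + \sqrt{-1}\,B$ with real matrices $A$, $B$ to the real orthogonal block matrix with rows $(A, -B)$ and $(B, A)$. This is one common way to realize the idea and is not claimed to be the exact construction of the cited proposition; all function names and the example matrix are assumptions for illustration.

```python
import math


def realify(U):
    """Return the real 2d x 2d block matrix [[A, -B], [B, A]] for U = A + iB."""
    d = len(U)
    top = [[U[r][c].real for c in range(d)] + [-U[r][c].imag for c in range(d)]
           for r in range(d)]
    bottom = [[U[r][c].imag for c in range(d)] + [U[r][c].real for c in range(d)]
              for r in range(d)]
    return top + bottom


def apply(M, x):
    """Matrix-vector product."""
    return [sum(M[r][c] * x[c] for c in range(len(x))) for r in range(len(M))]


def encode(psi):
    """Store the real parts (nodes v_r) first, then the imaginary parts (nodes v_i)."""
    return [z.real for z in psi] + [z.imag for z in psi]


# Hypothetical example: a 2 x 2 unitary with complex entries and a unit vector.
S = 1 / math.sqrt(5)
U = [[S + 0j, 2j * S], [2j * S, S + 0j]]
psi = [0.6 + 0j, 0.8j]

real_result = apply(realify(U), encode(psi))   # one step in the real encoding
complex_result = encode(apply(U, psi))         # the same step, then encoded
```

Both results agree up to rounding, and the doubled dimension corresponds to the factor of at most 2 in the number of nodes.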

Proposition 2.9: Each sequence $(G_n)_{n \in \mathbb{N}}$ of QBPs with short amplitudes can be simulated by
a sequence $(G'_n)_{n \in \mathbb{N}}$ of QBPs with $|G'_n| \le 2|G_n|$ such that there is a single algebraic number
$\alpha$ and a number $m = 2^{\mathrm{poly}(|G_n|)}$ such that each amplitude of $G'_n$ can be written as $p(\alpha)/m$ for
an integer polynomial $p$ with a degree bounded by $\mathrm{poly}(|G_n|)$ and coefficients bounded above in
absolute value by $2^{\mathrm{poly}(|G_n|)}$.
As for classical BPs, it is possible to simplify the structure of QBPs without increasing their
size too much. The following has been observed by Spalek [37].

Proposition 2.10 ([37]): Let $G$ be a QBP and let $t \in \mathbb{N}_0$. Then there is a leveled QBP $G'$
with $t + 1$ levels that for each input $a$ computes an output $r \in \{0, 1, ?\}$ with probability $p_{G,r}(a,t)$
after carrying out exactly $t$ computation steps and that does not stop before. The size of $G'$ is
bounded above by $(t + 1)^2 |G|$.

For the construction of QBPs, it is convenient to allow unlabeled nodes with an arbitrary number 
of outgoing edges carrying only amplitude labels. An unlabeled node v can be understood as an 
abbreviation for a node that is labeled by some input variable, where the value of this variable 
does not influence the computation. This means that each edge leading from the unlabeled 
node v to w has to be replaced with a 0-edge and a 1-edge from v to w which both have the 
same amplitude label as the original edge from v to w. When using unlabeled nodes we have to 
make sure that the QBP resulting from this transformation is unidirectional and well-formed. 

3. Definitions and Tools for Quantum Turing Machines 

We first introduce a nonuniform variant of quantum Turing machines (QTMs). The definition
is similar to those of Bernstein and Vazirani and Nishimura and Ozawa [27] for the
uniform setting. Afterwards, we collect tools for approximately performing arbitrary unitary
transformations by QTMs.

Definition 3.1: A nonuniform (or advice-taking) quantum Turing machine is a QTM $M =
(Q, \Sigma, \delta)$ together with an advice function $\mathrm{adv}\colon \mathbb{N} \to \Sigma^*$, where $Q$ is a finite set containing $q_0, q_f$
and $\Sigma = \Sigma_1 \times \cdots \times \Sigma_k$ with finite sets $\Sigma_1, \ldots, \Sigma_k$ each containing $\{0, 1, ?, B\}$. The QTM $M$
has the initial state $q_0$ and the unique final state $q_f$, and "B" is used as the blank symbol. The
machine is equipped with three tapes, a read-only input tape, a read-only advice tape, and the
work tape. All tapes are two-way infinite and indexed by $\mathbb{Z}$ and each is split into $k$ separate tracks
that may contain symbols from $\Sigma_1, \ldots, \Sigma_k$. We have $\delta\colon (Q \times \Sigma^3) \times (Q \times \Sigma \times \{-1, 0, 1\}^3) \to \mathbb{C}$, and
$\delta((q, \sigma_i, \sigma_a, \sigma_w), (q', \sigma'_w, d_i, d_a, d_w))$ is the amplitude for a transition from state $q$, with symbols
$\sigma_i, \sigma_a, \sigma_w$ on the input, advice, and work tape, resp., to state $q'$, writing $\sigma'_w$ on the work tape
and moving the heads on the three tapes according to $d_i, d_a, d_w$. Upon start of the machine, the
input tape is loaded with the input string $x \in \{0, 1\}^*$ at positions $0, \ldots, |x| - 1$ of the first track.
The advice tape is loaded with the advice string $\mathrm{adv}(|x|) \in \Sigma^*$ at positions $0, \ldots, |\mathrm{adv}(|x|)| - 1$.
All other tape positions contain blanks, all heads are at position 0, and the finite control of $M$
is in its initial state. A configuration of $M$ is a tuple $(q, w, i, j, k)$, with the current state of the
finite control $q \in Q$, the contents $w \in \Sigma^*$ of the work tape, and the positions $i, j, k \in \mathbb{Z}$ of the
heads on the input, advice, and work tape, resp. Let $C_n(M)$ be the set of all configurations of $M$
for inputs of length $n$. Let $\mathcal{H} = \mathbb{C}^{|C_n(M)|}$ be the Hilbert space spanned by all configurations from
$C_n(M)$, which we identify with vectors from an orthonormal basis. The time evolution operator
$U(a)$ describes the application of the transition function $\delta$ to a superposition of configurations,
where the input is $a$. The well-formedness constraint requires $U(a)$ to be unitary for all inputs
$a$.

Definition 3.2 (Computation of a nonuniform QTM): Let $M = (Q, \Sigma, \delta)$ be as in the
above definition. A QTM indicates stopping by entering $q_f$ and signals its output by an entry
at position 0, called the output cell, of a designated track of the work tape, called the output
track. Define $E_{\mathrm{state}}(A)$ as the projection operator over $\mathcal{H}$ onto the subspace spanned by all
configurations with state in $A \subseteq Q$. Then the projections $E_{\mathrm{state}}(\{q_f\})$, $E_{\mathrm{state}}(Q - \{q_f\})$ describe
the measurement checking whether the current state is equal to $q_f$. This measurement is
performed before each computation step. If the QTM does not stop, $U(a)$ is applied to the
state after the measurement. Let $E_{\mathrm{result}}(r)$, $r \in \{0, 1, ?\}$, be the projection onto the subspace
spanned by the configurations with result $r$ in the output cell. If stopping of the QTM has
been detected, the measurement described by these latter projections is carried out in order to
determine the result of the computation. For $T \in \mathbb{N}_0$ and $r \in \{0, 1, ?\}$, let

$$p_{M,r}(a,T) = \sum_{t=0}^{T} \bigl\| E_{\mathrm{result}}(r)\, E_{\mathrm{state}}(\{q_f\})\, \bigl(U(a)\, E_{\mathrm{state}}(Q - \{q_f\})\bigr)^t\, |s\rangle \bigr\|^2$$

be the probability that $M$ outputs $r$ on input $a$ during the first $T$ computation steps. Based
on these probabilities, acceptance of the QTM with different types of error is defined as usual.
The (expected) running time of $M$ on $a$, denoted by $T_M(a)$ ($\overline{T}_M(a)$), is defined analogously to
QBPs (Definition 2.5). The space used by $M$ on input $a \in \{0, 1\}^*$ is the maximum number
of cells on the work tape between the leftmost and rightmost non-blank symbol taken over all
configurations which are reached with nonzero amplitude during the computation on input $a$ and
in which the machine has not yet halted. The (total) space $s_M(a)$ used by $M$ on input $a \in \{0, 1\}^*$
is defined as the sum of the space on the work tape and $\lceil \log |\mathrm{adv}(|a|)| \rceil$. Finally, we say that
$M$ runs in space $s\colon \mathbb{N} \to \mathbb{N}_0$ if for all $a \in \{0, 1\}^n$, $s_M(a) \le s(n)$.

Definition 3.3: A reversible Turing machine (RTM) is a deterministic TM where each configuration
has at most one predecessor. A TM or QTM $M$ is called unidirectional if each state can
be entered from only one direction on each tape, i.e., if there are functions $D_i, D_a, D_w\colon Q \to
\{-1, 0, 1\}$ such that $\delta((q, \sigma_i, \sigma_a, \sigma_w), (q', \sigma'_w, d_i, d_a, d_w)) \neq 0$ only if $D_i(q') = d_i$, $D_a(q') = d_a$, and
$D_w(q') = d_w$.

Unidirectionality is a crucial property of QTMs that makes working with them much easier.
The property was first investigated by Bernstein and Vazirani for single-tape QTMs
that are additionally two-way, i.e., are required to move their head in each computation step.
Their results include that single-tape RTMs (even with stationary tape heads allowed) are
automatically unidirectional and, furthermore, that single-tape two-way QTMs can be simulated
time and space efficiently by unidirectional ones. Furthermore, it is well known that QTMs
with stationary tape heads allowed can also be simulated time efficiently by unidirectional ones using
the simulations of QTMs by quantum circuits and vice versa due to Yao [43] and Nishimura
and Ozawa [27]. These results cannot be applied in an obvious way in the space-bounded
scenario. Already for TMs with only one additional input tape, reversibility no longer
implies unidirectionality, as simple examples show. In Section 5 we show that general nonuniform
QTMs with sublinear space can be space efficiently simulated by unidirectional ones.

For constructing unidirectional nonuniform QTMs, we need the usual toolbox of programming 
primitives that allows us to work with multiple tracks, combine TMs, construct looping TMs and 
so on. We use appropriate versions of lemmas for these tasks due to Bernstein and Vazirani.
We only remark that, by going through their proofs, it is straightforward to extend these lemmas
to unidirectional RTMs and unidirectional QTMs, resp., with an arbitrary number of read-only 
input tapes. This includes nonuniform machines as a special case. 

In simulations of other models of quantum computation by QTMs, we face the problem of
carrying out an arbitrary given unitary transformation over a finite-dimensional Hilbert space
using only a finite program for the QTM. For doing this, we use a result due to Harrow,
Recht, and Chuang [14] that allows us to approximate any unitary operator over a
finite-dimensional Hilbert space by a product of "few" elements from a finite collection of "simple"
unitary transformations. The approximation is with respect to the operator norm, defined for
an operator $A$ over a Hilbert space $\mathcal{H}$ by
$\|A\| = \sup\{\|Ax\| \mid x \in \mathcal{H},\ \|x\| \le 1\}$. We say that $A'$
is an $\varepsilon$-approximation of $A$ or approximates $A$ with error $\varepsilon$ if
$\|A' - A\| \le \varepsilon$.

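Define the unitary matrices

\[
V_1 = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & 2\sqrt{-1} \\ 2\sqrt{-1} & 1 \end{pmatrix}, \quad
V_2 = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}, \quad\text{and}\quad
V_3 = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 + 2\sqrt{-1} & 0 \\ 0 & 1 - 2\sqrt{-1} \end{pmatrix}.
\]

For $i \in \{1,2,3\}$ let $V_{i+3} = V_i^{-1}$. Let $\mathcal{G}_2 = \{V_1, \ldots, V_6\}$.
For $i \in \{1, \ldots, 6\}$ and $j \in \{1, \ldots, d-1\}$
define the unitary $d \times d$-matrix $W_{i,j}$ by setting

\[
W_{i,j}\,|k\rangle =
\begin{cases}
(V_i)_{1,1}\,|j\rangle + (V_i)_{2,1}\,|j+1\rangle, & \text{if } k = j; \\
(V_i)_{1,2}\,|j\rangle + (V_i)_{2,2}\,|j+1\rangle, & \text{if } k = j+1; \\
|k\rangle, & \text{otherwise.}
\end{cases}
\]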
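The definition of the elementary matrices can be checked numerically. The following is a minimal sketch in plain Python, not part of the construction itself; the helper names (`dagger`, `matmul`, `is_unitary`, `elementary`) are illustrative assumptions. It builds $V_1, \ldots, V_6$ and the embedding $W_{i,j}$ and tests unitarity up to floating-point tolerance.

```python
import math

SQRT5 = math.sqrt(5.0)

# The three basic 2x2 matrices from the text (sqrt(-1) is written as 1j).
V1 = [[1 / SQRT5, 2j / SQRT5], [2j / SQRT5, 1 / SQRT5]]
V2 = [[1 / SQRT5, 2 / SQRT5], [-2 / SQRT5, 1 / SQRT5]]
V3 = [[(1 + 2j) / SQRT5, 0], [0, (1 - 2j) / SQRT5]]


def dagger(a):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(a)
    return [[complex(a[c][r]).conjugate() for c in range(n)] for r in range(n)]


def matmul(a, b):
    """Product of two square matrices of the same dimension."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def is_unitary(a, tol=1e-12):
    """Check that a^dagger a equals the identity up to tol."""
    n = len(a)
    p = matmul(dagger(a), a)
    return all(abs(p[r][c] - (1 if r == c else 0)) < tol
               for r in range(n) for c in range(n))


# V4, V5, V6 are the inverses, i.e. the conjugate transposes, of V1, V2, V3.
G2 = [V1, V2, V3, dagger(V1), dagger(V2), dagger(V3)]


def elementary(i, j, d):
    """W_{i,j}: V_i acting on span{|j>, |j+1>} (1-indexed), identity elsewhere."""
    v = G2[i - 1]
    w = [[1 if r == c else 0 for c in range(d)] for r in range(d)]
    for a in range(2):
        for b in range(2):
            w[j - 1 + a][j - 1 + b] = v[a][b]
    return w
```

For example, `elementary(2, 1, 4)` places $V_2$ in the upper-left $2 \times 2$ block of a $4 \times 4$ identity matrix.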



Let $\mathcal{G}_d$ be the set of all $W_{i,j}$ with $i \in \{1, \ldots, 6\}$ and
$j \in \{1, \ldots, d-1\}$. Recall that $SU(d)$
denotes the set of all unitary $d \times d$-matrices with determinant 1. Harrow, Recht, and
Chuang [14] have proved the
following lemma, where we have added the estimate of the bound for $k$ depending on $d$, while
in [14] the dimension is regarded as a constant.

Lemma 3.4 ([14]): There is a constant $c > 0$ such that for all $\varepsilon > 0$, $U \in SU(d)$, and
$k = \lceil c\,d^2 \log(d/\varepsilon) \rceil$, there are $U_1, \ldots, U_k \in \mathcal{G}_d$ such that
$\|U - U_1 \cdots U_k\| \le \varepsilon$.

Call the matrices $W_{i,j}$ with $i \in \{1, \ldots, 6\}$ and $j \in \{1, \ldots, d-1\}$ elementary.
Let $d = 2^m$ and let $|\psi\rangle \in \mathbb{C}^d$ be encoded in $m = \log d$ qubits on the work
tape of a QTM. Given $i, j$ as additional inputs, we would like to compute $W_{i,j}|\psi\rangle$, as
required for the application of Lemma 3.4. Bernstein and Vazirani have shown how to implement
this for a different set of two-dimensional transformations. By an easy adaptation of their
construction and an application of the simulation of single-tape two-way QTMs by unidirectional
ones, also from their paper, we obtain:

Lemma 3.5 ( 1 1 ): There is a unidirectional single-tape QTM $M_{\mathrm{elem}}$ with multiple tracks that
works as follows. Let $d = 2^m$ and let $|\psi\rangle \in \mathbb{C}^d$ be a superposition of $m$ qubits.
Let $c(i,j)$ consist of the binary codes of $i \in \{1, \ldots, 6\}$ and $j \in \{1, \ldots, d-1\}$.
Started with $|\psi\rangle$ in tape cells $0, \ldots, m-1$ of the first track and $|c(i,j)\rangle$ in the
tape cells $0, \ldots, |c(i,j)|-1$ of the second track, $M_{\mathrm{elem}}$ computes the output
$W_{i,j}|\psi\rangle$ on the first track, replacing $|\psi\rangle$, in time and space $O(m)$.
Furthermore, the running time of $M_{\mathrm{elem}}$ only depends on $m$, the length of the contents
on the first track.

Combining Lemmas 3.4 and 3.5, we can use a QTM to compute a good approximation of any
desired finite-dimensional unitary transformation. We still have to make sure that measuring
the state after applying the approximate transformation gives a result that agrees with that
after applying the original transformation with high probability. This can be shown using the
following statements. The first one is due to Bernstein and Vazirani; the proof of the second
one is analogous to that of a similar statement in [^f)], page 195.

Proposition 3.6: Let $U_1, \ldots, U_n$ and $V_1, \ldots, V_n$ be operators over a Hilbert space
$\mathcal{H}$ with $\|U_i\|, \|V_i\| \le 1$ and $\|U_i - V_i\| \le \varepsilon_i$ for $i = 1, \ldots, n$.
Then $\|U_1 \cdots U_n - V_1 \cdots V_n\| \le \varepsilon_1 + \cdots + \varepsilon_n$.

Lemma 3.7: Let $\varepsilon > 0$ and $t \in \mathbb{N}$. Let $U$ and $U'$ be unitary operators over a
Hilbert space $\mathcal{H}$ with $\|U - U'\| \le \varepsilon$. Let $P, Q$ be projections over
$\mathcal{H}$. Let $|v\rangle \in \mathcal{H}$ with $\||v\rangle\| = 1$. Define
$p = \|Q(UP)^t|v\rangle\|^2$ and $p' = \|Q(U'P)^t|v\rangle\|^2$. Then $|p - p'| \le 2t\varepsilon$.

4. Equivalence of QBPs and Space-Bounded Unidirectional Nonuniform QTMs

We prove our simulation results for QBPs and unidirectional nonuniform QTMs. We first 
provide a basic theorem that allows a step-by-step simulation of unidirectional nonuniform 
QTMs by QBPs and vice versa. Each step of a QBP can only be done approximately by a 
unidirectional nonuniform QTM. In order to control the total error, we have to specify the 
number of simulation steps in advance. This raises the problem of bounding the computation 
time of space-bounded algorithms that is studied afterwards. We first define a suitable notion 
of simulations. 
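The error control mentioned above rests on the fact that approximation errors of the individual factors add up at most linearly (Proposition 3.6). The following sketch illustrates this for $2 \times 2$ rotation matrices; the choice of rotations, the angles, and all helper names are assumptions made only for this illustration. The operator norm of a $2 \times 2$ matrix is computed in closed form as its largest singular value.

```python
import math
from functools import reduce


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def sub2(a, b):
    """Entrywise difference of two 2x2 matrices."""
    return [[a[r][c] - b[r][c] for c in range(2)] for r in range(2)]


def op_norm2(a):
    """Operator norm of a 2x2 matrix: square root of the largest eigenvalue of A^dagger A."""
    t = sum(abs(a[r][c]) ** 2 for r in range(2) for c in range(2))  # trace(A^dagger A)
    det = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0]) ** 2           # det(A^dagger A)
    disc = max(t * t - 4 * det, 0.0)                                 # guard against rounding
    return math.sqrt((t + math.sqrt(disc)) / 2)


def rot(theta):
    """Real rotation matrix, used here as a simple family of unitary operators."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


# Hypothetical exact factors U_i and slightly perturbed factors V_i.
thetas = [0.3, 1.1, -0.7, 2.0]
etas = [1e-3, -2e-3, 5e-4, 1e-3]
exact = [rot(t) for t in thetas]
approx = [rot(t + e) for t, e in zip(thetas, etas)]

eps = [op_norm2(sub2(u, v)) for u, v in zip(exact, approx)]
total = op_norm2(sub2(reduce(matmul2, exact), reduce(matmul2, approx)))
```

Here `total` is the error of the product and `sum(eps)` is the bound given by the proposition. Since the number of factors grows with the number of simulated steps, the per-factor error has to be chosen in advance as a function of that number.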

Definition 4.1: Let $M_1, M_2$ be nonuniform QTMs or QBPs. As defined in Sections 2 and 3, let
$p_{M_i,r}(a,T)$ be the probability that $M_i$ computes the output $r$ on the input $a$ during the
first $T$ computation steps. We say that $M_1$ simulates $T$ steps of $M_2$ in $T'$ steps with
accuracy $\varepsilon \ge 0$ if for all $a \in \{0,1\}^*$ and $r \in \{0,1,?\}$:
$|p_{M_1,r}(a,T') - p_{M_2,r}(a,T)| \le \varepsilon$. We say that $M_1$ simulates
$M_2$ if $M_1$ simulates $T$ steps of $M_2$ in the same number of steps with accuracy
$\varepsilon = 0$ for arbitrary $T$.

4.1. Basic Step-by-Step Simulations 
Theorem 4.2: 

(i) Let $M$ be a unidirectional nonuniform QTM that runs in space $S(n) = \Omega(\log n)$. Then
there is a sequence of QBPs $(G_n)_{n \in \mathbb{N}}$ with $|G_n| = 2^{O(S(n))}$ that simulate $M$.

(ii) Let $(G_n)_{n \in \mathbb{N}}$ be a sequence of QBPs with $|G_n| = \Omega(n)$. Let
$\varepsilon\colon \mathbb{N} \to (0,1)$ and $T\colon \mathbb{N} \to \mathbb{N}_0$.
Then there is a unidirectional nonuniform QTM that for each $n \in \mathbb{N}$ simulates $T(n)$
steps of $G_n$ in $\mathrm{poly}(|G_n|, T(n), \log(1/\varepsilon(n)))$ steps with accuracy
$\varepsilon(n)$ and runs in space $O(\log|G_n| + \log\log(T(n)/\varepsilon(n)))$.

We discuss the consequences of this theorem for the motivation of our QBP model and the
relationship between QBPs and QTMs in detail in Section 4.2.

Proof of Theorem 4.2, Part (i). This follows by an easy adaptation of the proof of the analogous
result for classical BPs and TMs. Let $M = (Q, \Sigma, \delta)$ be a unidirectional nonuniform QTM with
advice function $\mathrm{adv}\colon \mathbb{N} \to \Sigma^*$ that runs in space $S(n) = \Omega(\log n)$.
We ensure that the heads on the input and advice tape stay in the area consisting of the non-blank
cells (see for details). Then $M$ has at most $2^{O(S(n))}$ configurations.

We construct the QBP $G$ over the variable set $X = \{x_0, \ldots, x_{n-1}\}$ with $C_n(M)$ as its node
set. For a configuration $c \in C_n(M)$ of $M$ where the head on the input tape is at position
$i \in \{0, \ldots, n-1\}$, define $\mathrm{var}(c) = i$ in $G$ (recall that $\mathrm{var}(c)$ denotes the
index of the variable with which a QBP node is labeled). For an input with bit $b \in \{0,1\}$ at
position $i$ on the input tape of $M$, let the application of the transition function $\delta$ of $M$
to $|c\rangle$ yield the superposition

\[
\sum_{c' \in C_n(M)} \alpha(c,c',b)\,|c'\rangle, \qquad \alpha(c,c',b) \in \mathbb{C}.
\]

For each $\alpha(c,c',b) \neq 0$, we add a $b$-edge from $c$ to $c'$ in $G$ and use $\alpha(c,c',b)$ as
the amplitude label of this edge. We define the start node of $G$ as the initial configuration of
$M$ and identify the set of final nodes $F$ with the set of final configurations of $M$.

The graph $G$ defined above fulfills the well-formedness requirement of QBPs since the time
evolution operator of the QTM $M$ is unitary. In order to prove that $G$ is unidirectional, assume
for a contradiction that the node $v$ has predecessors $v_1$ and $v_2$ labeled by different variables.
Then during the transitions of $M$ that correspond to the transition of $v_1$ to $v$ and $v_2$ to $v$ the
head on the input tape makes different moves, in contradiction to the unidirectionality of $M$.
Since $|C_n(M)| = 2^{O(S(n))}$, the branching program is of the required size. It is easy to verify that
$G$ simulates $M$ because of the similarity of the definitions of the semantics for the two models.

□ 
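The construction in part (i) can be sketched as a small program: configurations become nodes, the input head position becomes the variable label, and every nonzero transition amplitude becomes a labeled edge. The data layout (dictionaries keyed by configuration names), the function names, and the two-configuration toy machine are assumptions for illustration only; they are not the paper's formal model.

```python
import math
from typing import Dict, List, Tuple

Config = str
# An edge is (source node, target node, boolean label b, amplitude).
Edge = Tuple[Config, Config, int, complex]


def build_qbp(head_pos: Dict[Config, int],
              alpha: Dict[Tuple[Config, int], Dict[Config, complex]]):
    """Turn a table of transition amplitudes alpha(c, c', b) into QBP edges.

    head_pos[c] is the input head position of configuration c; it becomes var(c).
    alpha[(c, b)][c'] is the amplitude of moving from c to c' when reading bit b.
    """
    var = dict(head_pos)
    edges: List[Edge] = []
    for (c, b), successors in alpha.items():
        for c_next, amp in successors.items():
            if amp != 0:  # only nonzero amplitudes give edges
                edges.append((c, c_next, b, amp))
    return var, edges


def outgoing_norm(edges: List[Edge], node: Config, bit: int) -> float:
    """Euclidean norm of the amplitudes on the bit-edges leaving node.

    For a unitary time evolution this is 1; it is only a necessary condition
    for well-formedness, not the full requirement.
    """
    return math.sqrt(sum(abs(amp) ** 2
                         for (c, _, b, amp) in edges if c == node and b == bit))


# Toy example: two configurations, both reading input position 0.
# Bit 0 leaves the configuration unchanged, bit 1 applies a Hadamard-like step.
h = 1 / math.sqrt(2)
head_pos = {"c0": 0, "c1": 0}
alpha = {
    ("c0", 0): {"c0": 1, "c1": 0},
    ("c1", 0): {"c0": 0, "c1": 1},
    ("c0", 1): {"c0": h, "c1": h},
    ("c1", 1): {"c0": h, "c1": -h},
}
var, edges = build_qbp(head_pos, alpha)
```

The number of nodes equals the number of configurations, which is where the size bound $2^{O(S(n))}$ in the theorem comes from.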

Proof of Theorem 4.2, Part (ii). Let $G$ be the QBP to be simulated and let
$X = \{x_0, \ldots, x_{n-1}\}$ be the variable set of $G$. In a first step, we show how to transform
$G$ into an equivalent QBP $G'$ which has the additional property that all nodes that are reachable
from the start node by a path of length $t$ are labeled by $x_{t \bmod n}$. This allows us to
decompose the time evolution operator into $n$ factors where each factor only depends on the value
of one of the variables. In a second step we construct a nonuniform QTM and its advice string from
the decomposed time evolution operator of $G'$ and prove the claims on the resources required by
this QTM.

Let $G = (V, E)$ and let $s$ and $F$ denote the start node and the set of sinks of $G$, resp. Due to the
unidirectionality of $G$, all predecessors of a node $v \in V$ are labeled by the same variable, whose
index is denoted by $\mathrm{pre}(v)$. If the start node does not have any predecessor, let
$\mathrm{pre}(s) = n - 1$. Furthermore, we set $\mathrm{var}(v) = 0$ for $v \in F$.

We construct the QBP $G' = (V', E')$ from $G$ by adding dummy nodes. Let
$V' = \{(v,i) \mid v \in V,\ i \in \{\mathrm{pre}(v)+1, \ldots, n-1, 0, \ldots, \mathrm{var}(v)\}\}$.
Let $s' = (s,0)$ be the start node of $G'$ and let $F' = \{(v,0) \mid v \in F\}$ be its set of sinks.
Define $\mathrm{var}(v,i) = i$ for all $(v,i) \in V'$ and $\mathrm{label}(v,0) = \mathrm{label}(v)$ for
all $v \in F$. For each $(v,w) \in E$, add an edge
$((v, \mathrm{var}(v)), (w, (\mathrm{pre}(w)+1) \bmod n))$ to $E'$ that inherits all labels of the edge
$(v,w)$. Furthermore, for each $w \in V$,
$i \in \{\mathrm{pre}(w)+1, \ldots, n-1, 0, \ldots, \mathrm{var}(w)-1\}$, and $b \in \{0,1\}$, add an
edge $((w,i), (w,(i+1) \bmod n))$ to $E'$ with boolean label $b$ and amplitude 1. Let $\delta'$ be the
transition amplitudes of $G'$ defined in this way. It is easy to see that $G'$ is a QBP.
Well-formedness and unidirectionality of $G'$ follow from the respective properties of $G$ for the
subgraph induced by the nodes in
$\{(v, \mathrm{var}(v)), (v, (\mathrm{pre}(v)+1) \bmod n) \mid v \in V\}$ and are obvious for the rest
of the graph. It is easy to see that $|G'| = O(n|G|)$.

Claim. $G'$ simulates $T$ steps of $G$ in $nT$ steps with accuracy 0. Furthermore, there are unitary
operators $U_i(b)$ with $0 \le i \le n-1$ and $b \in \{0,1\}$ such that for any time evolution
operator $U'(a)$ of $G'$ with $a \in \{0,1\}^n$, the projection $E'_{\mathrm{cont}}$ to the space
spanned by the non-sink nodes of $G'$, the start node $s'$ of $G'$, and any $T \in \mathbb{N}_0$,
$(U'(a)E'_{\mathrm{cont}})^{nT}|s'\rangle = \bigl((U_{n-1}(a_{n-1}) \cdots U_0(a_0))E'_{\mathrm{cont}}\bigr)^{T}|s'\rangle$.

Proof of the claim. For the proof that $G'$ simulates $T$ steps of $G$ with $nT$ of its own steps,
let $\varphi$ be the linear embedding of the superpositions of $G$ into those of $G'$ induced by
setting $\varphi(|v\rangle) = |(v,0)\rangle$ for $v \in V$. Let $U(a)$ and $U'(a)$ be time evolution
operators of $G$ and $G'$, resp., for the input $a \in \{0,1\}^n$. Let $E_{\mathrm{cont}}, E_r$ and
$E'_{\mathrm{cont}}, E'_r$ be the projections to the spaces spanned by the non-sink nodes and nodes
with output label $r$, resp., for the graphs $G$ and $G'$, resp. An easy induction shows that for each
$T \in \mathbb{N}_0$,
$(U'(a)E'_{\mathrm{cont}})^{nT}|s'\rangle = \varphi\bigl((U(a)E_{\mathrm{cont}})^{T}|s\rangle\bigr)$.
Furthermore, $E'_r\,\varphi(|v\rangle) = \varphi(E_r|v\rangle)$ for all $v \in V$. Hence,
$p_{G',r}(a, nT) = p_{G,r}(a, T)$ for all $T \in \mathbb{N}_0$,
and $G'$ simulates $T$ steps of $G$ with $nT$ steps.

Furthermore, it is also easy to prove by induction that for any $T \in \mathbb{N}_0$,
$i = T \bmod n$, and any $v \in V' - F'$ with $\mathrm{var}(v) \neq i$,
$\langle v|E'_{\mathrm{cont}}(U'(a)E'_{\mathrm{cont}})^{T}|s'\rangle = \langle v|(U'(a)E'_{\mathrm{cont}})^{T}|s'\rangle = 0$.
Hence, instead of applying $U'(a)$ in the $(T+1)$-st computation step, we may apply a unitary
extension $U_i(a_i)$ of the mapping defined by
$|v\rangle \mapsto \sum_{w \in V'} \delta'(v,w,a_i)\,|w\rangle$ for $v \in V'$ with
$\mathrm{var}(v) = i$, without changing the computed superposition. Finally, for all $v \in F'$ and
$T \bmod n \neq 0$, we have $\langle v|(U'(a)E'_{\mathrm{cont}})^{T}|s'\rangle = 0$. By induction, it
follows that for any $T \in \mathbb{N}_0$,
$(U'(a)E'_{\mathrm{cont}})^{nT}|s'\rangle = \bigl((U_{n-1}(a_{n-1}) \cdots U_0(a_0))E'_{\mathrm{cont}}\bigr)^{T}|s'\rangle$. □

Now we describe the second step of the proof, the construction of the QTM from $G'$. Let
$s = O(n|G|)$ be the number of nodes of $G'$. Let $m = \lceil \log s \rceil$. It is convenient to
assume that the node numbers have the length $m + 2$, where the numbers of interior nodes begin
with 00 and the numbers of 0- and 1-sinks with 01 and 11, resp. Furthermore, we assume that the
start node has the number 0.

Construction of the advice string. First, we define approximate representations for each matrix
$U_i(b)$, $0 \le i \le n-1$ and $b \in \{0,1\}$, as a list of elementary matrices using Lemma 3.4.
Choosing $\varepsilon' = \varepsilon/(2nT^2)$ as the error bound and $s$ as the dimension of the
Hilbert space, Lemma 3.4 yields $s \times s$-matrices $U_{i,0}(b), \ldots, U_{i,k-1}(b)$ whose product
is an $\varepsilon'$-approximation of $U_i(b)$, where
$k = O(s^2 \log(s/\varepsilon')) = O(s^2 \log(nsT/\varepsilon))$ is the number of matrices obtained
from the lemma. Observe that the number of elementary matrices in the representation of $U_i(b)$ is
the same for all $i$ and $b$. Elementary matrices are encoded such that the corresponding unitary
transformations can be applied using the QTM provided in Lemma 3.5. The code for an elementary
matrix $W_{j,j'}$ consists of the binary codes of $j \in \{1, \ldots, 6\}$ and
$j' \in \{1, \ldots, s-1\}$.

On the advice tape, we store the codes of the elementary matrices $U_{i,\ell}(b)$ for
$0 \le i \le n-1$, $b \in \{0,1\}$, and $\ell \in \{0, \ldots, k-1\}$, as well as some additional
administrative information. The information is organized using four tracks, where the non-blank
part of each track starts at position 0:

Track 1: Binary code of the input length n. 
Track 2: Binary code of k. 

Track 3: Binary code of the length of the code for an elementary matrix. 
Track 4: List of codes for all $U_{i,\ell}(b)$.

The length of the code of each elementary matrix is $O(\log s)$. Each of the $2n$ matrices $U_i(0)$
and $U_i(1)$ is encoded using $O(k \log s)$ bits. We have $k = O(s^2 \log(nsT/\varepsilon))$. Hence,
the length of the information on track 4 is bounded by
$O(2n \cdot k \log s) = \mathrm{poly}(s, \log(T/\varepsilon))$, which is also a bound
on the overall length of the advice string. The logarithm of this,
$O(\log s + \log\log(T/\varepsilon)) = O(\log|G| + \log\log(T/\varepsilon))$, is the contribution of
the advice tape to the space.
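To get a feeling for these bounds, the quantities can be evaluated for concrete parameters. The sketch below is only an illustration: the constant `c` from Lemma 3.4 is unknown and set to 1, the hidden constant in $s = O(n|G|)$ is taken to be 1, and the code length of an elementary matrix is taken as $\lceil \log_2 6 \rceil + \lceil \log_2 s \rceil$ bits. All of these are assumptions, as is the function name.

```python
import math


def advice_parameters(n, size_g, T, eps, c=1.0):
    """Evaluate the advice-string bounds for given n, |G|, T and eps.

    The constant c and the choice s = n * |G| are placeholders for the
    unspecified constants in the O-notation of the text.
    """
    s = n * size_g                         # number of nodes of G', up to a constant
    eps_prime = eps / (2 * n * T * T)      # per-factor error bound eps' = eps / (2 n T^2)
    k = math.ceil(c * s * s * math.log2(s / eps_prime))  # factors per matrix U_i(b)
    code_len = math.ceil(math.log2(6)) + math.ceil(math.log2(s))  # bits per elementary matrix
    track4_bits = 2 * n * k * code_len     # 2n matrices, k codes each
    return {
        "s": s,
        "eps_prime": eps_prime,
        "k": k,
        "code_len": code_len,
        "track4_bits": track4_bits,
        "position_bits": math.ceil(math.log2(track4_bits)),  # space to address the advice
    }


params = advice_parameters(n=4, size_g=8, T=10, eps=0.1)
```

The entry `position_bits` corresponds to the logarithmic term that the advice tape contributes to the space: even if `track4_bits` is large, addressing a position on the advice tape needs only its logarithm.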

Construction of the QTM. The QTM uses the following tracks on the work tape:

Track 1: Output track. The output of the QTM is in cell 0 of this track upon termination.

Track 2: Node register consisting of $m + 2$ cells that contains the current superposition of node
numbers of $G'$.

Track 3: Buffer for the code of $U_{i,\ell}(x_i)$.

Track 4: Counter $i$ with values in $\{0, \ldots, n-1\}$.

Track 5: Counter $\ell$ with values in $\{0, \ldots, k-1\}$.

Track 6: Buffer for the value of the current input bit.

Track 7: Buffer for the position of the currently applied $U_{i,\ell}(x_i)$ on the advice tape.

Initially, the work tape only contains blanks. By choosing an appropriate encoding of binary
numbers (see, e.g., [SHDj), we ensure that a string of blanks represents the number 0. Hence, the
counters on track 4 and track 5 are initialized with 0. Since the start node has the number 0,
the blanks from the initialization of the node register encode the start node.

Repeat step 6; this erases track 7.
Repeat step 4; this erases track 6.

Figure 2: Algorithm for the nonuniform QTM simulating $G'$.



The algorithm performed by the QTM is shown in Figure 2. The algorithm consists of an
infinite loop whose body, steps 2-11, simulates one computation step of the QBP $G'$. The loop
is left and the algorithm terminates in step 2 if a sink has been reached. We only bother to
simulate the first $nT$ computation steps of $G'$ and thus the first $T$ computation steps of $G$ with
sufficient accuracy. In the following, we describe how this algorithm is implemented.

We construct unidirectional RTMs for steps 2, 4, 6, and 7 with the following additional
properties. We ensure that these machines only use the space already allotted on the work tape,
that the time can be bounded by $O(1)$ and $O(n)$ for steps 2 and 4, resp., and by a polynomial
in the length of the advice tape, i.e., $\mathrm{poly}(s, \log(T/\varepsilon))$, for steps 6 and 7.
For step 2, we additionally take care that the running time only depends on the length of the node
register, but not on the actual contents of the node register. It is not hard to construct these
machines from scratch. Furthermore, Lemma 3.5 yields a unidirectional QTM for step 8 that has space
and running time bounded by the length of the node register, i.e., $O(\log s)$, and whose running
time is independent of the actual contents of the node register.

For constructing the final QTM from these basic RTMs, we apply appropriate versions of the
lemmas of Bernstein and Vazirani for dealing with unidirectional nonuniform RTMs and
unidirectional nonuniform QTMs. The finite loops are realized as described by Watrous |3Sj. At
the beginning of a loop, we check a starting/stopping condition for the loop and switch the state
of being outside or inside the loop, resp., when this condition is met. For the loops beginning
in steps 3 and 5, we use counters modulo $n$ and $k$, resp., and check as the starting/stopping
condition whether the counter is equal to zero.

Using these tools, we first combine the machines for the steps 4 and 6-11, implementing the
loops in steps 3 and 5 as described above, to get a QTM $M_{3\text{-}11}$ for steps 3-11. The outermost,
endless loop is then realized by modifying the RTM for step 2. We use a simple unidirectional
RTM constructed from scratch that carries out the described termination check, enters two
special states as placeholders depending on the value of cell 1 of the node register, and then
restarts its computation. We insert $M_{3\text{-}11}$ into the state for the value 0 of cell 1 (non-sink)
and replace the state for the value 1 (sink) with the final state $q_f$ of the whole QTM. This yields
the desired QTM for simulating $G'$ and thus $G$.

We note that a space-bounded RTM performing an infinite loop cannot carry out initialization
steps before the loop. By our choice of the encoding of the contents of the tracks, we do not
need such an initialization. Furthermore, we have ensured that the running time for the body of
the outermost loop is the same for all possible classical inscriptions in the node register. Hence,
even if the simulated QBP is in a superposition, step 2 is always reached simultaneously for all
nodes in the superposition.

Space and time requirements. The space on tracks 1-6 of the work tape is obviously bounded
by $O(1)$, $O(\log s)$, $O(\log s)$, $O(\log n)$, $O(\log k) = O(\log s + \log\log(T/\varepsilon))$,
and $O(1)$, resp. The space on track 7 is bounded above by the logarithm of the length of the advice
string, which is $O(\log s + \log\log(T/\varepsilon))$ as computed above. Since this is also the
contribution of the advice string to the space, the overall space complexity is of the same order.
We can estimate the running time for simulating one computation step of $G'$ (steps 2-11 of the
algorithm) as follows. The running time of steps 4 and 11 is $O(n)$. The running time of steps 6, 7,
9, and 10 is dominated by the length of the advice tape, which is of order
$\mathrm{poly}(s, \log(T/\varepsilon))$. Step 8 can be performed in time proportional to the length
of the node register, i.e., $O(\log s)$. Hence, the overall time for one computation step is also of
order $\mathrm{poly}(s, \log(T/\varepsilon)) = \mathrm{poly}(|G|, \log(T/\varepsilon))$.

Correctness. Let us assume for a moment that the product $U_{i,k-1}(x_i) \cdots U_{i,0}(x_i)$ equals
$U_i(x_i)$. Then it is easy to see that steps 4-10 exactly apply $U_i(x_i)$ and that steps 3-11
exactly apply $U_{n-1}(x_{n-1}) \cdots U_0(x_0)$ to the node register. Together with the termination
check in step 2, which realizes the projection $E'_{\mathrm{cont}}$ to the non-sink nodes of $G'$,
steps 2-11 exactly apply $U_{n-1}(x_{n-1}) \cdots U_0(x_0)E'_{\mathrm{cont}}$ to the node register if
the QTM does not stop. Due to the above claim, we know that this simulates $n$ successive
computation steps of $G'$ and thus one computation step of the original QBP $G$.

However, the product $U_{i,k-1}(x_i) \cdots U_{i,0}(x_i)$ is merely an $\varepsilon'$-approximation of
$U_i(x_i)$, where $\varepsilon' = \varepsilon/(2nT^2)$. By Proposition 3.6 we may estimate the error
in the application of $U_0(x_0), \ldots, U_{n-1}(x_{n-1})$ by $n\varepsilon'$. Let $p_{G,r}(a,t)$ be
the probability that $G$ halts after exactly $t$ steps at a sink labeled by $r \in \{0,1,?\}$. Let
$p_{M,r}(a,t)$ be the probability that $M$ halts after exactly $t$ iterations of steps 2-11 and
outputs $r$. As remarked above, the error of one iteration of the outer loop is bounded by
$n\varepsilon'$. By Lemma 3.7,
$|p_{G,r}(a,t) - p_{M,r}(a,t)| \le 2t\varepsilon' n \le \varepsilon/T$ for all $t = 0, \ldots, T$.
Hence,

\[
\Bigl|\sum_{t=0}^{T} p_{G,r}(a,t) - \sum_{t=0}^{T} p_{M,r}(a,t)\Bigr|
\le \sum_{t=0}^{T} \bigl|p_{G,r}(a,t) - p_{M,r}(a,t)\bigr| \le \varepsilon.
\]

Altogether, we have proved that $M$ simulates $T$ steps of $G$ in
$\mathrm{poly}(|G|, T, \log(1/\varepsilon))$ steps with accuracy $\varepsilon$. □
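The error budget in the correctness argument can be verified with exact rational arithmetic. The sketch below assumes only the per-step bound $2t\varepsilon' n$ and the choice $\varepsilon' = \varepsilon/(2nT^2)$ stated in the proof; the function name and the sample parameters are illustrative.

```python
from fractions import Fraction


def total_error_bound(n, T, eps):
    """Sum the per-step bounds 2 * t * eps' * n for t = 0, ..., T.

    eps should be a Fraction so that all arithmetic is exact.
    Returns the pair (sum of the bounds, largest single bound).
    """
    eps_prime = eps / (2 * n * T * T)          # eps' = eps / (2 n T^2)
    per_step = [2 * t * eps_prime * n for t in range(T + 1)]
    return sum(per_step), max(per_step)


eps = Fraction(1, 10)
total, worst = total_error_bound(n=5, T=7, eps=eps)
```

With this choice of $\varepsilon'$ the individual bounds equal $t\varepsilon/T^2$, so the largest one is $\varepsilon/T$ and their sum is $\varepsilon(T+1)/(2T)$, which does not exceed $\varepsilon$ for $T \ge 1$. The value of $n$ cancels out.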



4.2. High-Level Simulation Theorems 

Here we use the basic, technical simulations from the last subsection for proving that the 
logarithm of the size of QBPs and the space complexity of QTMs asymptotically agree for 
the standard models of QBPs and QTMs. On the way, we investigate the relationship between 
precision and running time for QBPs. All proofs are given in Section 4.3. We assume throughout
this subsection that the logarithm of the size of the considered QBPs and the space complexity 
of the QTMs are at least logarithmic in the input length. 

We begin with a simple corollary from the basic simulations. If we want to apply the
approximate simulation of QBPs by QTMs, we have to specify a bound $\varepsilon$ on the simulation
error and a bound $T$ on the number of simulation steps in advance. These parameters turn up in a
term of $O(\log\log(T/\varepsilon))$ in the space complexity of the simulating machine. If we
restrict ourselves to bounded error computation and to exponential running time, Theorem 4.2
immediately yields:

Corollary 4.3: The logarithm of the size of QBPs and the space complexity of unidirectional 
nonuniform QTMs are asymptotically equal if both models are restricted to bounded error and 
exponential running time in the worst case. Furthermore, the classes of functions computable by 
sequences of QBPs with polynomial size and by unidirectional nonuniform QTMs with logarith- 
mic space are the same if both models are restricted to bounded error and polynomial running 
time. 

Working with bounded running time is obviously practically motivated, but it is not clear
what kind of bounds can be chosen without restricting the computational power of the
space-bounded models considered here. In [3H1 and implicitly also in Watrous has investigated
this question for unidirectional uniform QTMs and has obtained answers analogous to the
situation for probabilistic TMs. He has shown that unidirectional uniform QTMs with rational
amplitudes and running in space $S(n) = \Omega(\log n)$ have an expected running time that is at
most doubly exponential in $S(n)$. This result can be extended to unidirectional uniform QTMs
with algebraic amplitudes using the ideas from his later papers |4f)(l41j.

These considerations provide the motivation to look at the relationship between the precision
allowed for the amplitudes and the running time also for the nonuniform model of QBPs. It
turns out that short amplitudes take over a role analogous to algebraic amplitudes for QTMs.

Theorem 4.4: 

(i) Sequences of QBPs $(G_n)_{n \in \mathbb{N}}$ with bounded error and short amplitudes and
sequences of QBPs $(G'_n)_{n \in \mathbb{N}}$ with bounded error and expected running time
$2^{\mathrm{poly}(|G'_n|)}$ have polynomially related size complexities.

(ii) Sequences of QBPs $(G_n)_{n \in \mathbb{N}}$ with unbounded error and short amplitudes can be
simulated by sequences of QBPs $(G'_n)_{n \in \mathbb{N}}$ of size $\mathrm{poly}(|G_n|)$ and with
expected running time $2^{\mathrm{poly}(|G_n|)}$.

Our final and main result of this subsection provides a justification to regard QBPs with short 
amplitudes as the natural standard variant of the model analogous to QTMs with algebraic 
amplitudes. 

Theorem 4.5: The logarithm of the size of QBPs with bounded or unbounded error and short 
amplitudes and the space complexity of unidirectional nonuniform QTMs with algebraic ampli- 
tudes and the same type of error are asymptotically equal. 

4.3. Proofs of Theorems 4.4 and 4.5

For the proofs of the theorems we need a couple of technical lemmas, which are concerned with
the analysis of a matrix series that describes the acceptance probability of a QBP. Using these
lemmas we provide two results on QBPs with short amplitudes, which are the basic tools for
proving Theorems 4.4 and 4.5. First, even in the case of unbounded error there is some gap
between the error probability and 1/2. Second, in QBPs with short amplitudes a probabilistic
clock can be added by which computations lasting too long are aborted.

For the following, consider an arbitrary QBP $G$ with $s$ nodes. For any fixed input $a$ for $G$ let
$U = U(a)$ be a unitary time evolution matrix of $G$. Recall that $E_{\mathrm{cont}}$ is the
projection operator in the measurement of the output label which belongs to the result "no label."
Let $D = U E_{\mathrm{cont}}$ and $M = D \otimes \overline{D}$, where $\overline{D}$ denotes the matrix
obtained from $D$ by taking the complex conjugate of each of its entries. Let $N = s^2$ denote the
dimension of $M$ and let $|1\rangle, \ldots, |N\rangle$ be the standard basis of $\mathbb{C}^N$. For
$v \in \{1, \ldots, s\}$, define $i_v = v + s(v-1) \in \{1, \ldots, N\}$. Then, for any
$v, w \in \{1, \ldots, s\}$,
$M_{i_w,i_v} = (\langle w| \otimes \langle w|)\, M\, (|v\rangle \otimes |v\rangle)$.

Lemma 4.6: 

(i) The probability that the node $w$ is reached after exactly $k$ computation steps in $G$ when
starting at the node $v$ is equal to $(M^k)_{i_w,i_v}$.

(ii) The absolute value of each eigenvalue of M is bounded above by 1. 

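Proof. Part (i) follows from
$(M^k)_{i_w,i_v} = (\langle w| \otimes \langle w|)(D^k \otimes \overline{D}^k)(|v\rangle \otimes |v\rangle)
= (D^k)_{w,v} \cdot \overline{(D^k)_{w,v}} = |(D^k)_{w,v}|^2$, which is obviously the desired
probability.

For part (ii) it suffices to prove that $\|M\| \le 1$, since $\|M\|$ provides an upper bound
on the absolute value of the eigenvalues of $M$ (see, e.g., (TSj, page 45). We have
$M^\dagger M = (D \otimes \overline{D})^\dagger (D \otimes \overline{D})
= (D^\dagger D) \otimes (\overline{D}^\dagger \overline{D})$. Furthermore,
$D^\dagger D = (U E_{\mathrm{cont}})^\dagger (U E_{\mathrm{cont}})
= E_{\mathrm{cont}}^\dagger E_{\mathrm{cont}} = E_{\mathrm{cont}}$. The eigenvalues of $D^\dagger D$
are thus from $\{0,1\}$, and the same holds for $\overline{D}^\dagger \overline{D}$.
Since the eigenvalues of $M^\dagger M$ are obtained as products of the eigenvalues of $D^\dagger D$
and $\overline{D}^\dagger \overline{D}$, it follows that $\|M\| \le 1$. □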
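Part (i) of the lemma can be checked on a small example. The following sketch uses a hypothetical three-node QBP on a fixed input (nodes 0 and 1 are interior, node 2 is a sink) with a unitary chosen only for illustration. It compares $(M^k)_{i_w,i_v}$ with $|(D^k)_{w,v}|^2$; indices are 0-based here, so $i_v = v \cdot s + v$.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]


def kron(a, b):
    """Kronecker product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[r // m][c // m] * b[r % m][c % m] for c in range(n * m)]
            for r in range(n * m)]


def conj(a):
    """Entrywise complex conjugate."""
    return [[complex(x).conjugate() for x in row] for row in a]


def matpow(a, k):
    """k-th power of a square matrix by repeated multiplication."""
    result = [[1 if r == c else 0 for c in range(len(a))] for r in range(len(a))]
    for _ in range(k):
        result = matmul(result, a)
    return result


h = 1 / math.sqrt(2)
s = 3
U = [[h, h, 0], [0, 0, 1], [h, -h, 0]]        # assumed unitary time evolution
E_cont = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]    # projection onto the non-sink nodes 0, 1
D = matmul(U, E_cont)
M = kron(D, conj(D))


def index(v):
    """0-based version of i_v = v + s(v - 1)."""
    return v * s + v


def max_deviation(max_k):
    """Largest gap between (M^k)_{i_w, i_v} and |(D^k)_{w, v}|^2 for k = 1..max_k."""
    worst = 0.0
    for k in range(1, max_k + 1):
        Dk, Mk = matpow(D, k), matpow(M, k)
        for v in range(s):
            for w in range(s):
                worst = max(worst, abs(Mk[index(w)][index(v)] - abs(Dk[w][v]) ** 2))
    return worst
```

The point of passing to $M$ is that probabilities, which are quadratic in the amplitudes, become linear functions of the entries of a matrix power, so they can be summed over $k$ as a matrix series.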

The above lemma yields that, for each pair of nodes $(v,w)$ in $G$,
$\lim_{k \to \infty} \bigl(\sum_{\ell=0}^{k} M^\ell\bigr)_{i_w,i_v}$ is the
probability of reaching node $w$ from node $v$ in $G$. In particular, the acceptance probability of
$G$ can be expressed as the sum of all such terms where $v$ is the start node and $w$ is a 1-sink.

We use the technique of Watrous [22111131^ to analyze the series
$\bigl(\sum_{\ell=0}^{\infty} M^\ell\bigr)_{i_w,i_v}$. Since the matrix series
$\sum_{\ell=0}^{\infty} M^\ell$ does not converge in general, we look at the series
$\sum_{\ell=0}^{\infty} (zM)^\ell$ for some $z \in [0,1)$ instead and let $z$ tend to 1 afterwards.
Using the restrictions on the involved numbers, we then show two facts: First,
$\lim_{z \uparrow 1} \bigl(\sum_{\ell=0}^{\infty} (zM)^\ell\bigr)_{i_w,i_v}$ can be approximated with
sufficient precision by choosing $z = 1 - 2^{-\mathrm{poly}(N)}$. Second, if the limit
$\bigl(\sum_{\ell=0}^{\infty} M^\ell\bigr)_{i_w,i_v}$ is not exactly 1/2, then it can be bounded away
from 1/2 by a gap of size at least $2^{-\mathrm{poly}(N)}$.

For a multivariate polynomial $f$, the height of $f$, denoted by $\|f\|$, is the maximum absolute
value of any of its coefficients and $\deg(f)$ is the maximum degree of $f$ with respect to any of
its variables. Using the form of the entries of $U = U(a)$ obtained by Proposition 2.9, it is easy
to see that there is a real algebraic number $\alpha$ not depending on $N$ and a number
$m = 2^{\mathrm{poly}(N)}$ such that each entry of
$M = U E_{\mathrm{cont}} \otimes \overline{U E_{\mathrm{cont}}}$ can be written as $p(\alpha)/m$ for
an integer polynomial $p$ with $\deg(p) = \mathrm{poly}(N)$ and $\|p\| = 2^{\mathrm{poly}(N)}$. The
following three technical lemmas yield properties of general matrices of this form (not necessarily
derived from QBPs). The first two lemmas are extracted from (Lemma 4.6 and its proof and the
beginning of the proof of Lemma 4.2, resp.).

Lemma 4.7 ([41]): Let $\alpha$ be any real algebraic number.

(i) If $f$ is a univariate polynomial with $\|f\| \le 2^d$, $\deg(f) \le d$, and $f(\alpha) \neq 0$,
then $|f(\alpha)| \ge 2^{-O(d^2)}$.

(ii) Let $f, g$ be bivariate integer polynomials with $\|f\|, \|g\| \le 2^d$,
$\deg(f), \deg(g) \le d$, and $g(\alpha,1) \neq 0$. Then there is a constant $c > 0$ such that for
any $\delta$ with $0 < \delta < 2^{-cd^2}$ and $d$ sufficiently large,

\[
\left| \frac{f(\alpha,1)}{g(\alpha,1)} - \frac{f(\alpha,1-\delta)}{g(\alpha,1-\delta)} \right|
\le \delta\, 2^{cd^2}.
\]

Lemma 4.8 ([4Q): Let $\alpha$ be any real algebraic number and let $m \in \mathbb{R}$. Let $M$ be an
$N \times N$-matrix such that for each entry $x$ there is an integer polynomial $p$ with
$x = p(\alpha)/m$ and $\deg(p) = \mathrm{poly}(N)$, $\|p\| = 2^{\mathrm{poly}(N)}$. Further suppose
that the eigenvalues of $M$ are bounded above in absolute value by 1. Let $1 \le i,j \le N$ and let
$S = \bigl(\sum_{\ell=0}^{\infty} M^\ell\bigr)_{i,j}$ be convergent. For $z \in [0,1)$, define
$S(z) = \bigl(\sum_{\ell=0}^{\infty} (zM)^\ell\bigr)_{i,j}$. Then there are bivariate integer
polynomials $f, g$ such that $\|f\|, \|g\| \le m^N 2^{\mathrm{poly}(N)}$,
$\deg(f), \deg(g) = \mathrm{poly}(N)$, $g(\alpha,1) \neq 0$, and

\[
\frac{f(\alpha,z)}{g(\alpha,z)} = S(z) \ \text{ for } z \in [0,1), \quad\text{and}\quad
\frac{f(\alpha,1)}{g(\alpha,1)} = S.
\]

Lemma 4.9: Let $m = 2^{\mathrm{poly}(N)}$. Let $M$ be an $N \times N$-matrix as in the previous lemma.
Let $S_{i,j} = \bigl(\sum_{\ell=0}^{\infty} M^\ell\bigr)_{i,j}$ for $1 \le i,j \le N$.

(i) Suppose that $S_{i,j}$ converges. For $z \in [0,1)$, let
$S_{i,j}(z) = \bigl(\sum_{\ell=0}^{\infty} (zM)^\ell\bigr)_{i,j}$. Then there is a
polynomial $p$ such that for any $z = 1 - \delta$ with $0 < \delta < 2^{-p(N)}$,
$|S_{i,j} - S_{i,j}(z)| \le \delta\, 2^{p(N)}$.

(ii) Let $I \subseteq \{1, \ldots, N\}^2$ and suppose that for each $(i,j) \in I$, $S_{i,j}$
converges. Let $S = \sum_{(i,j) \in I} S_{i,j}$. Then there is a polynomial $p$ such that
$S \neq 1/2$ implies $|S - 1/2| \ge 2^{-p(N)}$.

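Proof. Part (i): Use Lemma 4.8 to get bivariate integer polynomials $f_{i,j}, g_{i,j}$ such that

\[
\frac{f_{i,j}(\alpha,z)}{g_{i,j}(\alpha,z)} = S_{i,j}(z) \ \text{ for } z \in [0,1), \quad\text{and}\quad
\frac{f_{i,j}(\alpha,1)}{g_{i,j}(\alpha,1)} = S_{i,j}.
\]

By the lemma and the fact $m = 2^{\mathrm{poly}(N)}$, there is a polynomial $q$ such that
$\|f_{i,j}\|, \|g_{i,j}\| \le 2^{q(N)}$ and $\deg(f_{i,j}), \deg(g_{i,j}) \le q(N)$ and, furthermore,
$g_{i,j}(\alpha,1) \neq 0$. By Lemma 4.7(ii) applied to $f_{i,j}$ and $g_{i,j}$ with $d = q(N)$, it
follows that there is a constant $c > 0$ such that for all $0 < \delta < 2^{-cq(N)^2}$ and $N$
sufficiently large,

\[
|S_{i,j} - S_{i,j}(1-\delta)| =
\left| \frac{f_{i,j}(\alpha,1)}{g_{i,j}(\alpha,1)} - \frac{f_{i,j}(\alpha,1-\delta)}{g_{i,j}(\alpha,1-\delta)} \right|
\le \delta\, 2^{cq(N)^2}.
\]

Choosing $p(N) = cq(N)^2$ yields the desired bound for any $z = 1 - \delta$ with
$0 < \delta < 2^{-p(N)}$.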
Part (ii): By Lemma 4.8 it follows that for each $(i,j) \in I$,

\[
S_{i,j} = \frac{f_{i,j}(\alpha,1)}{g_{i,j}(\alpha,1)},
\]

where $f_{i,j}$ and $g_{i,j}$ are bivariate integer polynomials with
$\|f_{i,j}\|, \|g_{i,j}\| \le 2^{q(N)}$ and $\deg(f_{i,j}), \deg(g_{i,j}) \le q(N)$ for some
polynomial $q$, and $g_{i,j}(\alpha,1) \neq 0$ for all $(i,j) \in I$. Then

\[
S = \sum_{(i,j) \in I} \frac{f_{i,j}(\alpha,1)}{g_{i,j}(\alpha,1)} \neq \frac{1}{2}
\iff
2 \sum_{(i,j) \in I} f_{i,j}(\alpha,1) \prod_{(i',j') \in I \setminus \{(i,j)\}} g_{i',j'}(\alpha,1)
- \prod_{(i,j) \in I} g_{i,j}(\alpha,1) \neq 0.
\]

The left hand side of the last inequality is a polynomial in $\alpha$ with height at most
$2^{O(|I| \cdot q(N))} = 2^{\mathrm{poly}(N)}$ and degree at most $|I| \cdot q(N) = \mathrm{poly}(N)$,
since $|I| \le N^2$. Lemma 4.7(i) implies that the absolute value of this expression is lower bounded
by $2^{-q'(N)}$ for a suitable polynomial $q'$ and $N$ large enough. Hence,

\[
|S - 1/2| \ge \frac{2^{-q'(N)}}{2 \prod_{(i,j) \in I} |g_{i,j}(\alpha,1)|}.
\]

We have $\|g_{i,j}\| \le 2^{q(N)}$, $\deg(g_{i,j}) \le q(N)$, and
$\alpha, \alpha^2, \ldots, \alpha^{q(N)} = 2^{\mathrm{poly}(N)}$ since $\alpha$ is a constant.
This implies that $|g_{i,j}(\alpha,1)| \le 2^{q''(N)}$ for a polynomial $q''$ and $N$ sufficiently
large. Thus,

\[
|S - 1/2| \ge \frac{2^{-q'(N)}}{2 \cdot 2^{N^2 q''(N)}} \ge 2^{-p(N)}
\]

for $p(N) = q'(N) + N^2 q''(N) + 1$, which proves the claim. □

Now we can state and prove our first main lemma that allows us to bound the error probability 
of QBPs away from 1/2. 

Lemma 4.10: For each QBP $G$ with short amplitudes there exists a polynomial $q$ such that for
each input $a \in \{0,1\}^n$, $p_{G,1}(a) > 1/2$ implies $p_{G,1}(a) \ge 1/2 + 2^{-q(|G|)}$ and
$p_{G,1}(a) < 1/2$ implies $p_{G,1}(a) \le 1/2 - 2^{-q(|G|)}$.



Proof. Let $G$ be a QBP with short amplitudes on $n$ variables. By Proposition 2.9 we may
assume that the amplitudes in $G$ are of the form $p(\alpha)/m$, where $p$ is an integer polynomial
with $\deg(p) = \mathrm{poly}(|G|)$ and $\|p\| = 2^{\mathrm{poly}(|G|)}$ and where $\alpha$ is the same
algebraic number and $m = 2^{\mathrm{poly}(|G|)}$ is the same natural number for all amplitudes. Let
$v$ be the start node of $G$ and let $F_1 = \{w \mid w \text{ is a 1-sink of } G\}$. Let $N = |G|^2$
and let the $N \times N$-matrix $M$ describing the computation of $G$ on an input $a \in \{0,1\}^n$ as
well as the indices $i_v \in \{1, \ldots, N\}$ corresponding to nodes $v \in \{1, \ldots, |G|\}$ be
defined as above. Then the probability of $G$ accepting $a$ in the $k$th computation step is given by
$\sum_{w \in F_1} (M^k)_{i_w,i_v}$, and the total probability of accepting $a$ is
$p_{G,1}(a) = \sum_{w \in F_1} \bigl(\sum_{k=0}^{\infty} M^k\bigr)_{i_w,i_v}$. Since $G$ only contains
labels of the form $p(\alpha)/m$, the entries of $M$ are of the form $p'(\alpha)/m'$, where $p'$ is a
polynomial with $\deg(p') = \mathrm{poly}(|G|)$ and $\|p'\| = 2^{\mathrm{poly}(|G|)}$ and
$m' = m^2 = 2^{\mathrm{poly}(|G|)}$. Hence, part (ii) of Lemma 4.9 yields the claimed result. □

The other main argument in our proofs is the construction of a probabilistic clock, which works 
in the case of bounded as well as unbounded error. 

Lemma 4.11: For each sequence of QBPs $(G_n)_{n \in \mathbb{N}}$ with bounded or unbounded error and short
amplitudes, there is a sequence of QBPs $(G'_n)_{n \in \mathbb{N}}$ for the same function with short amplitudes,
the same type of error, size $\mathrm{poly}(|G_n|)$, and expected running time $2^{\mathrm{poly}(|G_n|)}$.

Proof. The main idea is similar to that of Simon [3B1 for limiting the running time of probabilistic 
Turing machines. We simulate G step-by-step. Before each simulation step, we stop and reject 
the input with fixed, small probability. A similar construction for unidirectional QTMs has 
been given in Lemma 4.6 of Watrous [39].
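
The effect of stopping with a fixed small probability before each step can be illustrated with a purely classical toy model. This is only a sketch under simplifying assumptions, not the quantum construction of the proof: a process is assumed to accept in step $k$ with probability $a_k$, and a clock rejects with probability $\delta$ before every step, so that acceptance in step $k$ is weighted by $(1-\delta)^k$. By Bernoulli's inequality the loss in acceptance probability is at most $\delta \sum_k k\, a_k$. The function names and the sample sequence are hypothetical.

```python
from fractions import Fraction


def accept_without_clock(a):
    """Total acceptance probability; a[k] is the probability of accepting in step k."""
    return sum(a)


def accept_with_clock(a, delta):
    """Acceptance probability when each of the k clock checks is survived with probability 1 - delta."""
    return sum((1 - delta) ** k * a_k for k, a_k in enumerate(a))


# Hypothetical example: acceptance probability 1/2^(k+2) in step k, for k = 0..19.
a = [Fraction(1, 2 ** (k + 2)) for k in range(20)]
delta = Fraction(1, 2 ** 10)

p = accept_without_clock(a)
p_clock = accept_with_clock(a, delta)
# Bernoulli's inequality: 1 - (1 - delta)^k <= k * delta.
loss_bound = delta * sum(k * a_k for k, a_k in enumerate(a))

print(float(p), float(p_clock), float(loss_bound))
```

The clock can only lower the acceptance probability, and a smaller $\delta$ gives a smaller loss at the price of a longer expected running time.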

Let $G$ be a QBP on $n$ variables of size $s$. By Proposition 2.9 we may assume that the
amplitudes of $G$ are fractions of some integer polynomial in an algebraic number and a
common denominator $m = 2^{\mathrm{poly}(s)}$. Let $q$ be some polynomial. We construct a QBP $G'$
with size polynomial in $s$, expected running time $2^{\mathrm{poly}(s)}$, and such that for all $a \in \{0,1\}^n$,
$p_{G,1}(a) - 2^{-q(s)} \le p_{G',1}(a) \le p_{G,1}(a)$. Together with Lemma 4.10 this implies the claim.

Let $t = t(s) = q(s) + p(s) + \log s$, where $p(s)$ is a polynomial defined later on. Let $v_1, \ldots, v_s$
be the nodes of $G$. The new QBP $G'$ is obtained from the QBP $G''$ shown schematically in
Figure 3. We use unlabeled nodes introduced in Section 2 to simplify the presentation. The
start node of $G''$ is $w_1$. The edges in the upper part of the figure represent the transformation
$|w_i\rangle \mapsto \beta |w'_i\rangle + \gamma |w^*_i\rangle$, where

$$ \beta = \frac{2^{2t+1} + 2^{t+1}}{2^{2t+1} + 2^{t+1} + 1} \quad \text{and} \quad \gamma = \frac{2^{t+1} + 1}{2^{2t+1} + 2^{t+1} + 1}. $$

[Figure 3: The QBP $G''$ used in the proof of Lemma 4.11.]

Then $\beta^2 + \gamma^2 = 1$, which is used to prove that the QBP is well-formed. Each node
$w'_i$, $i \in \{1, \ldots, s\}$, is a copy of the node $v_i$ in $G$ and is labeled by the same variable as $v_i$. For
each edge $(v_i, v_j)$ in $G$, an edge $(w'_i, w''_j)$ is inserted in $G''$ that carries the same labels. The
shaded part in the figure represents these edges. The node $w''_i$ is a sink if the corresponding
node $v_i$ in $G$ is, and each non-sink node $w''_i$ is unlabeled and has an outgoing edge with
amplitude 1 to node $w_i$ (not shown in the figure). The only nodes labeled by variables are $w'_1, \ldots, w'_s$;
all other nodes are unlabeled. We remove all unlabeled nodes from $G''$ to obtain the desired
QBP $G'$. It is easy to see that $G'$ constructed in this way is well-formed and unidirectional.
The only numbers added as amplitudes here, 1, $\beta$, and $\gamma$, are rational and have representations
of polynomial length. Hence, $G'$ also has short amplitudes.
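
The identity $\beta^2 + \gamma^2 = 1$ can be checked with exact rational arithmetic. The following sketch assumes the reading $\beta = (2^{2t+1} + 2^{t+1})/(2^{2t+1} + 2^{t+1} + 1)$ and $\gamma = (2^{t+1} + 1)/(2^{2t+1} + 2^{t+1} + 1)$ of the two amplitudes; the function name `clock_amplitudes` is hypothetical. It also checks that the stopping probability $\delta = \gamma^2$ lies between $2^{-2t-2}$ and $2^{-2t}$, which is what makes the expected running time $2^{O(t)}$.

```python
from fractions import Fraction


def clock_amplitudes(t):
    """Return (beta, gamma) for clock parameter t as exact rationals (assumed formulas)."""
    denom = 2 ** (2 * t + 1) + 2 ** (t + 1) + 1
    beta = Fraction(2 ** (2 * t + 1) + 2 ** (t + 1), denom)
    gamma = Fraction(2 ** (t + 1) + 1, denom)
    return beta, gamma


for t in range(1, 8):
    beta, gamma = clock_amplitudes(t)
    delta = gamma ** 2  # probability of stopping in one traversal
    assert beta ** 2 + gamma ** 2 == 1
    assert Fraction(1, 2 ** (2 * t + 2)) <= delta <= Fraction(1, 2 ** (2 * t))
```

With $a = 2^t$, the three numbers $2a+1$, $2a^2+2a$, and $2a^2+2a+1$ form a Pythagorean triple, which is why both amplitudes are rational.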

We observe that the probability of $G'$ terminating during a traversal of the upper part is
$\delta = |\gamma|^2 \le 2^{-2t}$. Hence, its expected running time is bounded by $2^{O(t)} = 2^{\mathrm{poly}(s)}$. Furthermore,
for all inputs $a \in \{0,1\}^n$, $p_{G',1}(a) \le p_{G,1}(a)$. It remains to show that for all inputs $a$,
$p_{G',1}(a) \ge p_{G,1}(a) - 2^{-q(s)}$.

Fix any input $a \in \{0,1\}^n$. Let $N = s^2$, let the $N \times N$-matrix $M$ describing the
computation of the original QBP $G$ on $a$, and let the mapping of nodes $v \in \{1, \ldots, s\}$ to
indices $i_v \in \{1, \ldots, N\}$ be defined as above. Let $v$ be the start node of $G$ and let $F_1 =
\{w \mid w \text{ is a 1-sink of } G\}$. As in the proof of Lemma 4.10, the total probability of $G$
accepting $a$ is $p_{G,1}(a) = \sum_{w \in F_1} \bigl(\sum_{k=0}^{\infty} M^k\bigr)_{i_w, i_v}$. Now recall that $G'$ performs the same computation
as $G$ with the only exception that it terminates the computation with probability $\delta$ before
each step of $G$. Hence, the probability of $G'$ accepting $a$ in the $k$th simulation step of $G$ after
not rejecting $k$ times in the first phase of the computation is $\sum_{w \in F_1} \bigl((1-\delta)^k M^k\bigr)_{i_w, i_v}$. We
obtain

$$ p_{G',1}(a) = \sum_{w \in F_1} \Bigl( \sum_{k=0}^{\infty} (1-\delta)^k M^k \Bigr)_{i_w, i_v}. $$

Now choose $p$ as the polynomial obtained when Lemma 4.9(i) is applied with $z = 1 - \delta$,
$s_{ij}(1) = p_{G,1}(a)$, and $s_{ij}(z) = p_{G',1}(a)$. The lemma implies that

$$ |p_{G',1}(a) - p_{G,1}(a)| \le \sum_{w \in F_1} \Bigl| \Bigl( \sum_{k=0}^{\infty} (1-\delta)^k M^k \Bigr)_{i_w, i_v} - \Bigl( \sum_{k=0}^{\infty} M^k \Bigr)_{i_w, i_v} \Bigr| \le |F_1| \cdot \delta \cdot 2^{p(s)}, $$

provided that $0 \le \delta \le 2^{-p(s)}$. The restriction on $\delta$ is easily seen to be satisfied since $\delta \le 2^{-2t}$
and $t = t(s) = p(s) + q(s) + \log s$. Using that $|F_1| \le s$, we obtain

$$ |F_1| \cdot \delta \cdot 2^{p(s)} \le |F_1| \cdot 2^{-2(q(s) + p(s) + \log s)} \cdot 2^{p(s)} \le 2^{-q(s)} $$

and thus $|p_{G',1}(a) - p_{G,1}(a)| \le 2^{-q(s)}$. Hence, $G'$ has all required properties. □

Now we have collected all tools for the proofs of Theorems 4.4 and 4.5. For the convenience of
the reader, we restate the theorems here. We begin with the proof of Theorem 4.5.

Theorem 4.5 (restatement): The logarithm of the size of QBPs with bounded or unbounded 
error and short amplitudes and the space complexity of unidirectional nonuniform QTMs with 
algebraic amplitudes and the same type of error are asymptotically equal. 

Proof. A simulation of unidirectional nonuniform QTMs by QBPs is already provided in
Theorem 4.2. It is easy to see that the resulting QBP has short amplitudes if the amplitudes of the
QTM are algebraic numbers.

Now let a sequence $(G_n)_{n \in \mathbb{N}}$ of QBPs with short amplitudes be given. By Lemma 4.11 we can
simulate $G_n$ by a QBP $G'_n$ with size $\mathrm{poly}(|G_n|)$, the same type of error, short amplitudes and
expected running time $T(n) = 2^{\mathrm{poly}(|G_n|)}$. In the case of bounded error, let $\varepsilon$ be the error bound
of $G'_n$. In the case of unbounded error, by Lemma 4.10 there is some polynomial $q(n)$ such
that the acceptance and rejection probabilities of $G'_n$ are strictly larger than $1/2 + 2^{-q(|G'_n|)}$ or
strictly smaller than $1/2 - 2^{-q(|G'_n|)}$, resp. In this case let $\varepsilon = \varepsilon(n) = 1/2 - 2^{-q(|G'_n|)}$ be the
error bound of $G'_n$. We choose $\varepsilon' = (1/2 - \varepsilon)/3$ and $T'(n) = T(n)/\varepsilon' = 2^{\mathrm{poly}(|G_n|)}$. Then we
apply the simulation of QBPs by QTMs from Theorem 4.2 for the accuracy $\varepsilon'$ and the running
time $T'(n)$. The space complexity of the QTM is $O(\log |G'_n| + \log\log(T'(n)/\varepsilon')) = O(\log |G_n|)$.
By Markov's inequality, the probability that the running time of $G'_n$ and thus the number of
performed simulation steps exceeds $T'(n) = T(n)/\varepsilon'$ is bounded by $\varepsilon'$. Hence, the probability of
an error caused by running more than $T'(n)$ simulation steps is bounded by $\varepsilon'$ and the overall
error probability is bounded by $\varepsilon + 2\varepsilon' = 1/2 - \varepsilon'$. □
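
The error accounting at the end of this proof is plain arithmetic and can be replayed exactly. The sketch below assumes the choices $\varepsilon' = (1/2 - \varepsilon)/3$ and $T' = T/\varepsilon'$ made in the proof; the function names and the sample values are hypothetical.

```python
from fractions import Fraction


def error_budget(eps):
    """Return (eps_prime, overall_error) for an error bound eps < 1/2."""
    eps_prime = (Fraction(1, 2) - eps) / 3
    # eps from the QBP itself, eps' from the approximate simulation,
    # eps' from runs that are cut off after T' steps.
    overall = eps + 2 * eps_prime
    return eps_prime, overall


def time_cutoff(expected_time, eps_prime):
    """Cutoff T' = T / eps'; by Markov's inequality it is exceeded with probability at most eps'."""
    return expected_time / eps_prime


eps = Fraction(1, 3)  # hypothetical error bound
eps_prime, overall = error_budget(eps)
print(eps_prime, overall, time_cutoff(Fraction(1000), eps_prime))
```

For every $\varepsilon < 1/2$ the overall error equals $1/2 - \varepsilon'$ and therefore stays strictly below $1/2$.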

Theorem 4.4 (restatement): 

(i) Sequences of QBPs $(G_n)_{n \in \mathbb{N}}$ with bounded error and short amplitudes and sequences of
QBPs $(G'_n)_{n \in \mathbb{N}}$ with bounded error and expected running time $2^{\mathrm{poly}(|G'_n|)}$ have polynomially
related size complexities.

(ii) Sequences of QBPs $(G_n)_{n \in \mathbb{N}}$ with unbounded error and short amplitudes can be simulated by
sequences of QBPs $(G'_n)_{n \in \mathbb{N}}$ of size $\mathrm{poly}(|G_n|)$ and with expected running time $2^{\mathrm{poly}(|G'_n|)}$.

Proof. A simulation of QBPs $(G_n)_{n \in \mathbb{N}}$ with short amplitudes by QBPs $(G'_n)_{n \in \mathbb{N}}$ with expected
running time $2^{\mathrm{poly}(|G_n|)}$ for bounded and unbounded error is contained in Lemma 4.11. This
proves one direction of part (i) as well as part (ii). It remains to prove the missing direction of
part (i), i.e., to provide a simulation of QBPs with bounded error and an expected exponential
running time by QBPs with bounded error and short amplitudes. Let $(G_n)_{n \in \mathbb{N}}$ be a sequence of
QBPs with expected running time $T(n) = 2^{\mathrm{poly}(|G_n|)}$ and error probability $\varepsilon \in [0, 1/2)$. As in the proof
of Theorem 4.5 we choose $\varepsilon' = (1/2 - \varepsilon)/3$ and $T'(n) = T(n)/\varepsilon' = 2^{\mathrm{poly}(|G_n|)}$ and apply the
simulation of QBPs by QTMs of Theorem 4.2 for the accuracy $\varepsilon'$ and the running time $T'(n)$.
By the same arguments as in the proof of Theorem 4.5 we obtain a unidirectional nonuniform
QTM simulating the given QBP with bounded error, expected running time $T(n)$, and space
complexity $O(\log |G_n|)$. The transition function of the QTM only contains a constant number
of algebraic numbers.

In a second step we apply the simulation of unidirectional nonuniform QTMs by QBPs from
Theorem 4.2. The resulting QBP has an error probability of at most $\varepsilon'$. Its size is bounded
above by $2^{O(\log |G_n|)} = \mathrm{poly}(|G_n|)$. The amplitudes occurring in the QBP are the amplitudes of
the transition function of the QTM and thus are short. □

5. Simulation of Nonuniform QTMs by Unidirectional Nonuniform QTMs

In this section, we consider nonuniform RTMs and QTMs that, unlike in the previous
sections, are not necessarily unidirectional. We show that they can be simulated space-efficiently
by their unidirectional counterparts. We discuss some consequences of the simulation result at
the end of this section.

Our simulation result uses the construction of the universal QTM due to Yao [13] and Nishimura
and Ozawa [27] based on a simulation of QTMs by quantum circuits and vice versa as
intermediate steps. The original simulations cannot be applied since they use markers on the work tape
of the simulating machine to store the positions of the simulated tape heads and (which is more
serious) generate a quantum circuit for the simulated machine online on the work tape. Both
are too costly in terms of space. These obstacles are overcome here by using a space-efficient
encoding of the positions of the input tape heads and by storing a representation of the required
quantum circuit on the advice tape.

As a preparation for the proof of our simulation result, we state a simple necessary property
of the transition function of QTMs with two read-only input tapes, which is extracted from the
proof of Theorem 4.5 in [28]. In the following, the expression $[A = B]$ has the value 1 if $A = B$,
and 0 otherwise.

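Lemma 5.1 ([28]): Let $M = (Q, \Sigma, \delta)$ be a QTM with two read-only input tapes. Let $p, p' \in Q$,
$\Delta = (\Delta_1, \Delta_2) \in \mathbb{Z}^2$ and $a_1, a_2, a'_1, a'_2, v, w, v', w' \in \Sigma$.

(i) $$ 0 = \sum_{\substack{q \in Q,\ d'' \in \{0,1\}, \\ d, d' \in \{-1,0,1\}^2}} \delta(p, (a_1, a_2, v), q, w, (d, d''-1))^* \cdot \delta(p', (a'_1, a'_2, v'), q, w', (d', d'')) \cdot [d' - d = \Delta]. $$

(ii) $$ 0 = \sum_{\substack{q \in Q, \\ d, d' \in \{-1,0,1\}^2}} \delta(p, (a_1, a_2, v), q, w, (d, -1))^* \cdot \delta(p', (a'_1, a'_2, v'), q, w', (d', 1)) \cdot [d' - d = \Delta]. $$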

Now we can state and prove our result. 
Theorem 5.2: 

(i) Each nonuniform RTM that runs in space $S$ at least logarithmic in the input length and
time $T$ can be simulated by a unidirectional nonuniform RTM running in time $\mathrm{poly}(S, T)$
and space $O(S)$.

(ii) Let $\varepsilon > 0$ and $T\colon \mathbb{N} \to \mathbb{N}_0$. For each nonuniform QTM $M$ running in space $S$ at
least logarithmic in the input length, there is a unidirectional nonuniform QTM that
simulates $M$ for $T$ steps in $\mathrm{poly}(2^{O(S)}, T, \log(1/\varepsilon))$ steps with accuracy $\varepsilon$ using space
$O(S + \log\log(T/\varepsilon))$.

Proof. In the main part of the proof, we deal with part (ii). We handle necessary changes for 
part (i) and RTMs at the end. We first describe how we encode the information about the 
simulated machine on the work tape of the simulating machine. Then we present a high-level 
algorithm carrying out a whole simulation step and define a unitary transformation realizing a 
single transition of the simulated machine. Afterwards, this unitary transformation is imple- 
mented approximately by the simulating unidirectional nonuniform QTM. 

Storage layout on the work tape. Let $M = (Q, \Sigma, \delta)$ be a nonuniform QTM that is to be
simulated unidirectionally. We regard the advice tape simply as an additional read-only input
tape. We assume that for input length $n$ and space bound $S \ge \log n$ the heads on the input
tapes $i \in \{1, 2\}$ of $M$ only reach the positions $0, \ldots, n_i - 1$, where $n_1 = n + 2$, $n_2 = \mathrm{poly}(n)$,
and that the work tape head only reaches the positions $0, \ldots, n_3 - 1$ with $n_3 = S + 2$ (this may
be achieved using end markers). We assume that $\{0, 1, 2\} \subseteq \Sigma$.

Let $\ell = \ell_1 + 6\ell_2 + 1$ with $\ell_1 = \lceil \log |Q| \rceil$ and $\ell_2 = \max\{ \lceil \log n_i \rceil \mid i \in \{1,2,3\} \} = O(S)$, and
assume w.l.o.g. that $\ell \ge 3$. The information about the simulated machine is stored on two
tracks of the work tape of the simulating machine as shown below.



[Figure: the two tracks of the work tape. Track 1 shows the info block (containing $q$) extending up to cell $i+\ell-2$; Track 2 shows the symbols $w_1, w_2, w_3$.]

Track 2 contains the work tape of the simulated machine. In $\ell$ consecutive cells on track 1, which
are called the info block, we encode all administrative information for the simulation. The
position of the info block is used to indicate the position of the head on the work tape in a classical
configuration. If the cells of the info block are located at positions $i-1, i, i+1, \ldots, i+\ell-2$
on the work tape as shown in the figure, we say that the info block is at position $i$. In this
situation, the inscription in the info block together with the symbols $w_1, w_2, w_3 \in \Sigma$ in cells
$i-1, i, i+1$ on track 2 are called the info window induced by the info block.

The information stored in the info block consists of the local state $q \in Q$ of the simulated
machine encoded in binary, a flag $\varphi \in \{0, 1\}$ showing whether the current transition step has
already been carried out, and vectors $\xi = (\xi_1, \xi_2, \xi_3)$, $\nu = (\nu_1, \nu_2, \nu_3)$ in $\{0, \ldots, n_1 - 1\} \times \cdots \times
\{0, \ldots, n_3 - 1\}$ encoded in binary. The coordinates of $\xi$ are the positions of the tape heads of
the simulated machine. Similarly, $\nu_1$ and $\nu_2$ are the positions of the heads on the input tapes
of the simulating machine. Finally, $\nu_3$ is the position of the info block. We write the contents
of the info window shown above as $(q, \varphi, \xi, \nu, w)$, where $w = (w_1, w_2, w_3)$.

Carrying out a simulation step. We first give an outline of our approach. For the simulation
of a single step of $M$, we let the input tape heads of the simulating machine as well as the info
block on the work tape successively move to all combinations of positions in $\{0, \ldots, n_1 - 1\} \times
\cdots \times \{0, \ldots, n_3 - 1\}$ on the tapes that may be accessed. If during this sweep the machine reaches
a configuration where the positions of the heads of the input tapes as well as the position of the
info block, which are encoded in $\nu$, all agree with the stored positions of those of the simulated
machine and $\varphi = 0$, then a local transition of the simulated machine is applied, for which we
update the contents of the info window and set $\varphi = 1$. After the sweep through all positions is
complete, the flag $\varphi$ is negated.
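
The role of the flag in this sweep can be seen in a deterministic, classical toy version with a single tape. This is a hypothetical simplification for illustration only; the construction in the text works with three tapes and quantum superpositions. Without the flag, a head that has just moved to the right would be met again later in the same sweep, and the transition would be applied a second time.

```python
def simulate_step(state, head, tape, delta, use_flag=True):
    """Carry out one simulation step by sweeping over all tape positions nu.

    delta maps (state, symbol) to (new_state, written_symbol, move), move in {-1, 0, 1}.
    The tape is modified in place. Returns (state, head, phi) after the sweep.
    """
    phi = 0
    for nu in range(len(tape)):
        # Apply a transition only where the sweep position agrees with the stored
        # head position and (if the flag is used) no transition happened yet.
        if nu == head and (phi == 0 or not use_flag):
            new_state, symbol, move = delta[(state, tape[head])]
            tape[head] = symbol
            state, head = new_state, head + move
            phi = 1
    phi = 1 - phi  # negate the flag after the sweep is complete
    return state, head, phi


# Hypothetical machine: always write 1 and move right.
delta = {("q", 0): ("q", 1, 1), ("q", 1): ("q", 1, 1)}

tape = [0, 0, 0, 0]
print(simulate_step("q", 0, tape, delta), tape)

tape_no_flag = [0, 0, 0, 0]
print(simulate_step("q", 0, tape_no_flag, delta, use_flag=False), tape_no_flag)
```

With the flag, exactly one cell is rewritten per sweep; without it, the right-moving head is chased across the whole tape within a single sweep.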

In Figure 4 this is described in more detail as a high-level algorithm. We use the following
notation. For $x = (x_1, x_2, x_3) \in \{0, \ldots, n_1 - 1\} \times \cdots \times \{0, \ldots, n_3 - 1\}$, let $|x| = x_3 n_2 n_1 + x_2 n_1 + x_1$.
Furthermore, let $|q, \varphi, \xi, \nu, w_1 w_2 w_3\rangle$ denote an ON-basis indexed by the different possible
classical inscriptions of the info window.
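
The number $|x|$ is a mixed-radix encoding of the position vector, and the sweep visits the vectors in the order of these numbers. The sketch below assumes that the sweep advances by $|\nu'| = (|\nu| + 1) \bmod n_1 n_2 n_3$; the function names are hypothetical.

```python
def index_of(x, n):
    """|x| = x3*n2*n1 + x2*n1 + x1 for x = (x1, x2, x3) and n = (n1, n2, n3)."""
    x1, x2, x3 = x
    n1, n2, _ = n
    return x3 * n2 * n1 + x2 * n1 + x1


def vector_of(k, n):
    """Inverse of index_of for 0 <= k < n1*n2*n3."""
    n1, n2, _ = n
    return (k % n1, (k // n1) % n2, k // (n1 * n2))


def successor(nu, n):
    """Next position vector in the sweep, wrapping around to (0, 0, 0)."""
    n1, n2, n3 = n
    return vector_of((index_of(nu, n) + 1) % (n1 * n2 * n3), n)


n = (3, 4, 2)  # hypothetical tape lengths n1, n2, n3
nu = (0, 0, 0)
visited = []
while True:
    visited.append(nu)
    nu = successor(nu, n)
    if nu == (0, 0, 0):  # stopping condition of the loop
        break
print(len(visited))
```

Starting and stopping at $(0,0,0)$, the loop visits each of the $n_1 n_2 n_3$ position vectors exactly once.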

Realizing a Transition Unitarily. Next we show that step 2 of the high-level algorithm can
be described by a unitary transformation. For this, let the heads on the input tapes of the
simulating machine as well as the info block on the work tape be at fixed positions. Let $a_1, a_2 \in \Sigma$
be the symbols under the input tape heads. Our goal is to specify a unitary transformation
$U_{\mathrm{trans}} = U_{\mathrm{trans}}(a_1, a_2)$ that changes the contents of the info window according to the high-level
algorithm. Using an idea due to Yao [13], we only carry out the identity in step 2.2 for those
inscriptions of the info window that can actually arise during the computation at this point.
This is required to allow the transformations of steps 2.1 and 2.2 to be combined to a unitary
one.

Loop with starting/stopping condition $\nu = (0,0,0)$:

1. Move the real input tape heads and the info block on the work tape to the positions
in $\nu$.

2. Transition: Let $(p, \varphi, \xi, \nu, (w_1, w_2, w_3))$ be the contents of the current info window and
let $a_1, a_2 \in \Sigma$ be the symbols under the input tape heads.

2.1. If $\xi = \nu$ and $\varphi = 0$, replace the contents of the info window with the superposition

$$ \sum_{q \in Q,\ b \in \Sigma,\ d \in \{-1,0,1\}^3} \delta(p, (a_1, a_2, w_2), q, b, d)\, |q, 1, \xi + d, \xi, w_1 b w_3\rangle. $$

2.2. For all inscriptions of the info window that do not satisfy the condition of step 2.1
and can actually arise during the computation, do nothing.

3. Move real input tape heads and the info block on the work tape to positions $(0, 0, 0)$.

4. Update $\nu$ to a new vector $\nu'$ such that $|\nu'| = (|\nu| + 1) \bmod n_1 \cdot n_2 \cdot n_3$.

Set $\varphi = 1 - \varphi$. End of simulation step.

Figure 4: High-level description of the simulation step.

$$ |v^{(1)}_{p,\xi,w_1,w_2,w_3}\rangle = |p, 0, \xi, \xi, w_1 w_2 w_3\rangle $$

Figure 5: Vectors for the definition of $U_{\mathrm{trans}}$.

For a precise definition of $U_{\mathrm{trans}}$, we introduce the collections of vectors in Figure 5. For these
definitions, let $p \in Q$, $\xi = (\xi_1, \xi_2, \xi_3), \nu = (\nu_1, \nu_2, \nu_3) \in \{0, \ldots, n_1 - 1\} \times \cdots \times \{0, \ldots, n_3 - 1\}$, and
$w, b, w_1, w_2, w_3 \in \Sigma$. The summations are over all $q \in Q$, $b \in \Sigma$, and $d = (d_1, d_2, d_3) \in \{-1, 0, 1\}^3$
if not indicated otherwise. Let $V_i$ be the set of vectors with upper index $i \in \{1, \ldots, 5\}$.

We require that the transformation $U_{\mathrm{trans}}$ satisfies

$$ U_{\mathrm{trans}} |v^{(1)}_{p,\xi,w_1,w_2,w_3}\rangle = |v^{(2)}_{p,\xi,w_1,w_2,w_3}\rangle $$

for all $p$, $\xi$, and $w_1, w_2, w_3$ and that $U_{\mathrm{trans}} |v\rangle = |v\rangle$ for all $|v\rangle \in V_3 \cup V_4 \cup V_5$. The following claim
implies that the above requirements can be satisfied by a unitary operator $U_{\mathrm{trans}}$, completing
this part of the proof.

Claim. The sets $V_1$, $V_2$, and $V_3 \cup V_4 \cup V_5$ are mutually orthogonal and the vectors in $V_2$ form
an ON-basis.

Proof of the claim. The claim follows from the fact that $M$ is a legal QTM and thus has a
unitary time evolution operator. We use the notion "superposition of $M$" to describe a unit
vector from the Hilbert space spanned by the classical configurations of $M$ as an ON-basis.

The vectors in $V_2$ form an ON-basis: We regard the vectors in $V_1$ and $V_2$ as unique descriptions
of superpositions of $M$. This is possible since the contents of the work tape of $M$ that is outside
the three symbols in the info window is fixed. Each vector $|v^{(2)}_{p,\xi,w_1,w_2,w_3}\rangle$ uniquely describes
the image of the classical configuration described by $|v^{(1)}_{p,\xi,w_1,w_2,w_3}\rangle$ under the time evolution
operator of $M$. Since this time evolution operator is unitary and the vectors in $V_1$ obviously
form an ON-basis, the vectors from $V_2$ also form an ON-basis.

The vectors in $V_1$, $V_2$, $V_3 \cup V_4 \cup V_5$ are mutually orthogonal: We write $M_1 \perp M_2$ for two sets
of vectors $M_1$ and $M_2$ if $\langle v \mid w \rangle = 0$ for all $v \in M_1$ and $w \in M_2$ and prove the statement by
considering all possible pairs of sets in the list.

$V_1 \perp V_2$, $V_1 \perp V_3 \cup V_4 \cup V_5$, $V_2 \perp V_3$: This follows immediately, since either the component for the
flag $\varphi$ or that for the position vector $\nu$ distinguishes vectors from the considered sets.

$V_2 \perp V_4$: We consider any pair of vectors $|v^{(2)}_{p,\xi,w_1,w_2,w_3}\rangle$ and $|v^{(4)}_{p',\xi',w',w'_2,w'_3}\rangle$. We may assume
that $w'_3 = w_3$, $\nu'_i = \xi_i$ for $i \in \{1,2\}$ and $\xi'_3 = \xi_3 - 1$ since otherwise the inner product of these
vectors is obviously zero. By keeping only the summands in the inner product for which the
basis vectors meet, we get

$$ \langle v^{(2)} \mid v^{(4)} \rangle = \sum_{\substack{q \in Q,\ d, d' \in \{-1,0,1\}^3 \\ \text{with } d'_3 \in \{0,1\}}} \delta(p, (a_1, a_2, w_2), q, w'_2, d)^* \cdot \delta(p', (a'_1, a'_2, w'), q, w_1, d') \cdot [d' - d = \xi - \xi']. $$

For the $d, d'$ over which the summation is done, it is required that $d'_3 - d_3 = \xi_3 - \xi'_3 = 1$, i.e.,
$d_3 = d'_3 - 1$. The sum may thus be rewritten as

$$ \sum_{\substack{q \in Q,\ d'' \in \{0,1\}, \\ d, d' \in \{-1,0,1\}^2}} \delta(p, (a_1, a_2, w_2), q, w'_2, (d, d''-1))^* \cdot \delta(p', (a'_1, a'_2, w'), q, w_1, (d', d'')) \cdot [d' - d = (\xi_1, \xi_2) - (\xi'_1, \xi'_2)]. $$

For $\Delta = (\xi_1, \xi_2) - (\xi'_1, \xi'_2)$, Lemma 5.1(i) implies that the sum takes the value 0. Thus the
considered vectors are orthogonal.

$V_2 \perp V_5$: This case is handled similarly to the latter one, now using part (ii) of Lemma 5.1. □

Constructing the Simulating QTM. We now describe how the QTM simulating the given
QTM $M$ unidirectionally is constructed. This simulating QTM carries out an endless loop
executing single simulation steps until the simulated machine terminates, similar to the
machine constructed for part (ii) of Theorem 4.2. It is initialized as follows.

- The info block belonging to the initial configuration of $M$ is located at position 0 of track 1
of the work tape. The complete contents of the respective info window is then $(q_0, 0, \xi, \nu, w)$,
where $q_0$ is the initial state of $M$, $\xi = \nu = (0, 0, 0)$, and $w$ only consists of blanks.

- All input tape heads of the simulating machine are at position 0.

As in the last section, this initialization is realized by choosing the encoding for the information
on the work tape such that the blank tape is consistent with the above requirements.

We realize the high-level algorithm by first constructing a unidirectional RTM for everything 
except for step 2, for which the RTM has a special state as a placeholder. This is easy by 
putting together machines for basic tasks using appropriate versions of the lemmas of Bernstein 
and Vazirani [TJ, as in the last section. Afterwards, we insert a QTM for carrying out step 2 
which has still to be constructed. We ensure that the running time of this QTM is independent 
of the inscriptions of the info window. Then the complete QTM for the high-level algorithm 
obtained by the insertion has a running time independent of the contents of the different tapes. 

The transformation $U_{\mathrm{trans}}$ operates on a Hilbert space of dimension $2^{O(\ell)} = 2^{O(S)}$. The
number of iterations of the loop is $n_1 n_2 n_3 = \mathrm{poly}(n) \cdot S$. Reusing the calculations in the proof of
Theorem 4.2(ii), it follows that a description of $U_{\mathrm{trans}}$ with accuracy $\varepsilon' = \varepsilon / (2 n_1 n_2 n_3 T^2)$ by
elementary matrices adds $O(S + \log\log(T/\varepsilon))$ to the total space complexity if it is stored on
the advice tape. This is within the required bound for part (ii) of the theorem. The chosen
accuracy $\varepsilon'$ is sufficient to carry out the $T$ simulation steps with accuracy $\varepsilon$. This corresponds
to $n_1 n_2 n_3 T$ executions of $U_{\mathrm{trans}}$. The transformation $U_{\mathrm{trans}}$ is realized by carrying out the
respective elementary transformations as described in the last section, using Lemma \'A. 51 

Resources. The running time for carrying out $U_{\mathrm{trans}}$ is dominated by the length of its description
on the advice tape and can be estimated by $2^{O(S)} \log(T/\varepsilon)$. The number of iterations of the
loop is $\mathrm{poly}(n) \cdot S$. Thus the total time required for one simulation step can be estimated by
$O(\mathrm{poly}(n)\, 2^{O(S)} \log(T/\varepsilon)) = \mathrm{poly}(2^{O(S)}, \log(T/\varepsilon))$.

Correctness. We show that each single computation step is performed correctly. We first
consider step 2.1 of the high-level algorithm and the case that the condition in this step is
met. We assume that the current configuration of the simulating machine is consistent with
our described invariants, that track 2 and the info block contain classical inscriptions, and that
the latter is at a fixed position. Then it is easy to see that $U_{\mathrm{trans}}$ correctly realizes a single
transition of $M$.

It remains to check that step 2.2 does not change anything. We observe that before the
transition of $M$ has been carried out in step 2.1, $U_{\mathrm{trans}}$ performs the identity in step 2.2, since
all encountered info window inscriptions correspond to vectors from $V_3$. Immediately after the
transition, the info window operated upon contains a vector $|v\rangle \in V_2$. If after one or two shifts
of the info window to the right on the work tape we adapt $|v\rangle$ by inserting the new $\nu$, this yields
a vector from $V_4$ or $V_5$, resp., on which $U_{\mathrm{trans}}$ also performs the identity. If the window is shifted
further to the right, the distance of the info window from the stored position of the work tape
head in each classical inscription contained in the current superposition is at least two. Then
the vector obtained by adapting $|v\rangle$ as described belongs to $V_3$ and $U_{\mathrm{trans}}$ also performs the
identity. Hence, $U_{\mathrm{trans}}$ behaves as desired. Altogether, we have completed the proof of part (ii).

Simulation of RTMs. We can use the same construction as above, but replace the
implementation of $U_{\mathrm{trans}}$. In this case, $U_{\mathrm{trans}}$ is just a permutation of inscriptions of the info window.
This permutation can be computed exactly by a reversible circuit of size $\mathrm{poly}(\ell)$ consisting
only of Toffoli gates. The description of this circuit on the advice tape adds an amount of
$O(\log \ell) = O(\log S)$ to the space complexity and its simulation takes time $\mathrm{poly}(\ell) = \mathrm{poly}(S)$,
which yields an overall bound on the time of $\mathrm{poly}(S, T)$. Hence, also part (i) follows. □

Since the simulation of QTMs in Theorem 5.2 is done only approximately and the space
$O(S + \log\log(T/\varepsilon))$ needed for the simulation increases with the running time, we again face
the question in which cases we can bound the running time without restricting the
computational power of the model. Here we need a statement for bounding the error probability away
from 1/2 in the case of unbounded error and a construction of a probabilistic clock for QTMs.

Lemma 5.3: For each nonuniform QTM $M$ with algebraic amplitudes and running in space
$S(n)$ there exists a polynomial $q$ such that for each input $a \in \{0,1\}^n$, $p_{M,1}(a) > 1/2$ implies
$p_{M,1}(a) > 1/2 + 2^{-q(2^{S(n)})}$ and $p_{M,1}(a) < 1/2$ implies $p_{M,1}(a) < 1/2 - 2^{-q(2^{S(n)})}$.

Lemma 5.4: For each nonuniform QTM $M$ with bounded or unbounded error, algebraic
amplitudes and running in space $S(n)$, there is a QTM for the same function with algebraic
amplitudes, the same type of error, the space bound $O(S(n))$ and expected running time $2^{2^{O(S(n))}}$.

Lemma 5.3 is proved in the same way as Lemma 4.10, since the matrix describing the transition
probabilities in the proof in the same way describes transition probabilities of nonuniform QTMs.
For the proof of Lemma 5.4 we modify the given QTM $M$ in a way similar to the construction
of the QBP in the proof of Lemma 4.11. Using the proof of Lemma 4.6 in Watrous [39] it is
easy to construct a QTM $M_t$ that for an appropriate $t = 2^{O(S)}$ stops with probability $2^{-\Theta(t)}$
and continues with probability $1 - 2^{-\Theta(t)}$. Using suitable versions of the lemmas of Bernstein
and Vazirani ^JJ for the construction of QTMs we modify M in such a way that, before 
each computation step, it additionally performs $M_t$. By a reasoning similar to the proof of
Lemma 4.11 we obtain a QTM with the behavior claimed in Lemma 5.4. Using these results
we easily obtain the following.

Theorem 5.5: The space complexity of nonuniform QTMs with algebraic amplitudes and 
bounded or unbounded error is asymptotically equal to the space complexity of unidirectional 
nonuniform QTMs with the same kind of amplitudes and the same type of error, provided that 
these space complexities are at least logarithmic in the input length. 

Proof. Applying Lemmas 5.3 and 5.4 to a nonuniform QTM that according to the hypothesis
runs in space $S$, we obtain a nonuniform QTM of the same kind running in expected time $2^{2^{O(S)}}$.
Analogously to the proofs in the last section, using Markov's inequality to estimate the error of
computations that take longer than time $2^{2^{O(S)}}$, Theorem 5.2 yields a unidirectional nonuniform
QTM of the desired kind running in space $O(S)$. □

6. Quantum OBDDs 

Since for unrestricted branching programs no powerful lower bound methods are known, re- 
stricted variants of branching programs have been investigated in order to develop lower bound 
methods and to compare different modes of nondeterminism and randomization. A simple 
variant of branching programs closely related to the uniform model of DFAs and to one-way 
communication complexity are ordered binary decision diagrams (OBDDs). OBDDs are also 
used as a data structure for the representation and manipulation of boolean functions, see, e.g.,
Wegener [32]. Hence, it is natural to also investigate the quantum variant of OBDDs.

Definition 6.1: A quantum OBDD (QOBDD) is a read-once QBP where on each path the 
variables are tested according to the same order. 

Below, we prove upper and lower bound results for QOBDDs. Before we do that, we discuss
the definition of QOBDDs and their relationship to quantum finite automata. Furthermore,
we define complexity classes in terms of the size of QOBDDs and compare them with the
corresponding complexity classes for OBDDs.

Since on each path from the start node to a sink each variable is tested at most once, QOBDDs
are always acyclic. Because of the definition of QBPs, QOBDDs are also unidirectional. In
contrast to Definition 6.1, Ablayev, Gainutdinova, and Karpinski require QOBDDs to be
leveled such that there are edges only between adjacent levels. Proposition 2.10 shows that this
restriction is not crucial, because QOBDDs according to Definition 6.1 can be transformed into
leveled QOBDDs where the size increases by a factor of at most $(n+1)^2$.

Despite their superficial similarity, there are some important differences between QOBDDs and
(1-way) quantum finite automata (QFAs). At the definition level, observe that, unlike QFAs,
QOBDDs may read their input in an order different from $x_1, \ldots, x_n$. Furthermore, they are a
nonuniform model while QFAs are uniform. This implies two less obvious differences between
QOBDDs and QFAs. In general, measuring whether the computation has stopped and, if yes,
with which result, is allowed also during the computation of a QOBDD. The more restrictive
definition that allows end nodes to be reached only after exactly $n$ computation steps have been
performed is equivalent to our definition because of Proposition 2.10. On the other hand, it
is known that QFAs with and without such intermediate measurements are of different power
(Kondacs and Watrous [20]). Furthermore, one can decrease the error probability of a QOBDD
with bounded error by probability amplification below any given constant, as for randomized
OBDDs (see [32] for the randomized case). Again, this does not work for QFAs: Ambainis and
Freivalds [7] have shown that the language $\{a\}^*\{b\}^*$ can be recognized by QFAs with two-sided
error 0.318, but not with error smaller than 2/9.

For QOBDDs, we distinguish the same types of error as for general QBPs (see Definition 2.5).
For characterizing the relative power of the resulting different types of QOBDDs, it is useful to
define complexity classes with a naming convention analogous to that used for QTMs. The class
of functions that can be computed exactly by polynomial size QOBDDs is called EQP-OBDD,
and the class of functions with polynomial size zero error (bounded-error) QOBDDs is called
ZQP-OBDD (BQP-OBDD). Similarly, the classes P-OBDD and BPP-OBDD of functions with
polynomial size deterministic OBDDs and polynomial size randomized OBDDs with bounded
error are defined. Furthermore, let Rev-OBDD denote the class of functions with polynomial size
reversible OBDDs. The inclusions Rev-OBDD $\subseteq$ EQP-OBDD $\subseteq$ ZQP-OBDD $\subseteq$ BQP-OBDD
and Rev-OBDD $\subseteq$ P-OBDD $\subseteq$ BPP-OBDD immediately follow from the definitions.

In this section we present simple, concrete example functions in order to prove that QOBDDs
with bounded error and classical, deterministic OBDDs are incomparable in power, i.e.,
P-OBDD $\not\subseteq$ BQP-OBDD and BQP-OBDD $\not\subseteq$ P-OBDD. We also present a partially defined
function in order to show a similar result for QOBDDs and classical, randomized OBDDs for
partial functions. Finally, we study the power of zero error and exact quantum computation
for OBDDs. We prove that ZQP-OBDD $\subseteq$ Rev-OBDD, i.e., zero error QOBDDs can at best
be as good as reversible OBDDs. This implies that the three classes Rev-OBDD, EQP-OBDD,
and ZQP-OBDD coincide and are strictly contained in P-OBDD.

6.1. A Function with Small QOBDDs that Requires Large Deterministic OBDDs 

The permutation matrix test function $\mathrm{PERM}_n$ is defined on $n^2$ boolean variables that are
arranged in a square matrix. The function takes the value 1 iff each row and each column
contains exactly one entry 1. It is well-known that $\mathrm{PERM} = (\mathrm{PERM}_n)_{n \in \mathbb{N}}$ does not have
polynomial size read-once branching programs (Krause, Meinel and Waack [21]) and, therefore, no
polynomial size OBDDs either. In [3S] (see also [12]) a polynomial size randomized OBDD
with one-sided error for PERM has been designed using the so-called fingerprinting technique. 
We show here how this construction can be modified to work for QOBDDs.

Let $X$ denote the input matrix and let $x_j = (x_{j,0}, \ldots, x_{j,n-1})$ denote the $j$th row of $X$. Let
$|x_j| = \sum_k x_{j,k} 2^k$ denote the value of the $j$th row interpreted as a binary number. The crucial
observation is that

$$\mathrm{PERM}_n(X) = 1 \iff \sum_{j=0}^{n-1} |x_j| - (2^n - 1) = 0 \;\wedge\; \text{each } x_j \text{ contains exactly one entry 1}.$$

The exact evaluation of the sum $S = \sum_{j=0}^{n-1} |x_j| - (2^n - 1)$ requires OBDDs of exponential size.
Hence, $S$ is evaluated modulo a randomly chosen prime number $p$. It is straightforward to
construct a reversible OBDD $G^{(p)}$ that evaluates $S \bmod p$ and simultaneously checks that each
$x_j$ contains exactly one entry 1. In $G^{(p)}$ the variables are tested in a rowwise order. For each
row it has to be stored whether an entry 1 has already been found. If a second 1 is found in
some row, a 0-sink is reached. Furthermore, in each level the OBDD stores the partial sum
of the terms corresponding to the bits already read. Since the partial sums are only stored
modulo $p$, this increases the width merely by a factor of $p$. Altogether, each level contains at
most $2p$ interior nodes. Hence, the size of $G^{(p)}$ is $O(pn^2)$. It only accepts if $S \bmod p$ is equal to
0.
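The rowwise evaluation described above can be sketched in a few lines of Python. This is only an illustration of the fingerprinting idea, not the OBDD itself: the function names are hypothetical, the input is assumed to be a list of 0/1 rows, and the loop keeps exactly the information the text mentions (the partial sum modulo $p$ and a flag for the current row).

```python
def perm_exact(matrix):
    """Reference check: every row and every column holds exactly one 1."""
    n = len(matrix)
    rows_ok = all(sum(row) == 1 for row in matrix)
    cols_ok = all(sum(matrix[j][k] for j in range(n)) == 1 for k in range(n))
    return rows_ok and cols_ok


def fingerprint_accepts(matrix, p):
    """Rowwise scan that stores only S mod p and a 'seen a 1 in this row' flag."""
    n = len(matrix)
    partial = (-(2 ** n - 1)) % p          # constant term of S, reduced mod p
    for row in matrix:
        seen_one = False
        for k, bit in enumerate(row):
            if bit:
                if seen_one:               # second 1 in this row: reject (0-sink)
                    return False
                seen_one = True
                partial = (partial + pow(2, k, p)) % p
        if not seen_one:                   # a row without any 1 is rejected as well
            return False
    return partial == 0                    # accept iff S mod p == 0
```

A permutation matrix is accepted for every prime, while a matrix with $S \neq 0$ is accepted only for the primes dividing $S$; for example, rows with values 1, 1, 4, 4 (so $S = -5$) pass the test for $p = 5$ but not for $p = 3$ or $p = 7$.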

Now we construct a QOBDD $G$ for $\mathrm{PERM}_n$. Let $m = 2n^2$ and let $p_1, \ldots, p_m$ denote the $m$
smallest primes. By the prime number theorem, $p_m = O(m \log m) = O(n^2 \log n)$. We construct
$G^{(p_1)}, \ldots, G^{(p_m)}$ and combine these reversible OBDDs by a node labeled by the first variable
with $m$ outgoing $c$-edges with amplitudes $1/\sqrt{m}$ leading to the $c$-successors of the start nodes
of $G^{(p_1)}, \ldots, G^{(p_m)}$. This realizes a random choice between $G^{(p_1)}, \ldots, G^{(p_m)}$. The size of $G$ is
bounded by $O(n^6 \log n)$.

We estimate the error probability. The sum $S$ is bounded above by $n2^n$. Hence, if $S$ is different
from 0, it has at most $n + \log n$ prime factors. Thus the probability of randomly choosing a
prime dividing $S$ is bounded above by $(n + \log n)/(2n^2) < 1/n$. This is also an upper bound on
the error probability of $G$. The error is one-sided, i.e., if $\mathrm{PERM}_n(X) = 1$, then the QOBDD $G$
always computes 1, while it may err if $\mathrm{PERM}_n(X) = 0$. The probability can even be made
smaller than $1/p(n)$ for any polynomial $p$ by increasing the number of primes, which only
increases the size of $G$ polynomially. We have proved:
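The error estimate can be checked numerically on small instances. The following sketch (hypothetical helper names; it assumes the input already has exactly one 1 per row and $S \neq 0$, and it takes $\log$ to be base 2) counts how many of the $2n^2$ smallest primes divide $S$ and compares the resulting fraction with the bound $(n + \log n)/(2n^2)$.

```python
import math


def first_primes(count):
    """Return the `count` smallest primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % q for q in primes if q * q <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def s_value(matrix):
    """S = sum of the row values (bit k weighs 2**k) minus (2**n - 1)."""
    n = len(matrix)
    total = sum(bit << k for row in matrix for k, bit in enumerate(row))
    return total - (2 ** n - 1)


def false_accept_fraction(matrix):
    """Fraction of the 2n^2 smallest primes p with S mod p == 0."""
    n = len(matrix)
    m = 2 * n * n
    s = s_value(matrix)
    return sum(1 for p in first_primes(m) if s % p == 0) / m


def error_bound(n):
    """The bound (n + log n) / (2 n^2) from the text, with log base 2."""
    return (n + math.log2(n)) / (2 * n * n)
```

For the $4 \times 4$ matrix with row values 1, 1, 4, 4 we have $S = -5$, so only the prime 5 among the 32 smallest primes leads to a wrong acceptance.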

Theorem 6.2: There are QOBDDs for $\neg\mathrm{PERM}_n$ with one-sided error $1/n$ and size $O(n^6 \log n)$.
Corollary 6.3: BQP-OBDD $\not\subseteq$ P-OBDD.

6.2. Functions with Small Deterministic OBDDs that Require Large QOBDDs 

The disjointness function and the inner product function are defined by $\mathrm{DISJ}_n(x_1, \ldots, x_n) =
(\overline{x}_1 \vee \overline{x}_2) \wedge (\overline{x}_3 \vee \overline{x}_4) \wedge \cdots \wedge (\overline{x}_{n-1} \vee \overline{x}_n)$ and $\mathrm{IP}_n(x_1, \ldots, x_n) = x_1x_2 \oplus \cdots \oplus x_{n-1}x_n$, where $n$
is an even number. Both functions are extensively investigated in communication complexity,
see, e.g., [22]. For the variable order $x_1, \ldots, x_n$ they have OBDD size $O(n)$, since it suffices to
store at most two bits at each level of the OBDD, namely, the value of the variable read in the 
last step and the value that the function takes on the variables up to the last variable with an 
even index. However, both functions are difficult for QOBDDs and, therefore, also for reversible 
OBDDs, since these OBDD models have difficulties in "forgetting" variables read. 
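The $O(n)$ upper bound for deterministic OBDDs corresponds to a streaming evaluation with two bits of state. The sketch below is illustrative only; it assumes the usual disjointness convention in which each clause is $(\overline{x}_{2i-1} \vee \overline{x}_{2i})$, and the function names are hypothetical.

```python
def disj(bits):
    """DISJ_n read in the order x_1, ..., x_n with two bits of state."""
    value, last = 1, 0                 # function value so far, last odd-indexed bit
    for position, bit in enumerate(bits, start=1):
        if position % 2 == 1:
            last = bit                 # remember x_{2i-1}
        else:
            value &= 1 - (last & bit)  # clause (not x_{2i-1}) or (not x_{2i})
    return value


def inner_product(bits):
    """IP_n read in the order x_1, ..., x_n with two bits of state."""
    value, last = 0, 0
    for position, bit in enumerate(bits, start=1):
        if position % 2 == 1:
            last = bit
        else:
            value ^= last & bit        # add x_{2i-1} * x_{2i} modulo 2
    return value
```

Note that the variable `last` is overwritten at every odd position. This kind of forgetting is exactly what a deterministic OBDD may do freely and what reversible OBDDs and QOBDDs cannot do.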

The lower bound proof uses some ideas due to Nayak [25] based on quantum information
theory. We briefly introduce the required notions and facts. For a proper introduction to 
quantum information theory we refer to (2^]. Recall that a mixed state of a quantum system 
is a probability distribution of pure quantum states. A mixed state is usually described by
its density matrix, which is a positive matrix with unit trace. The density matrix for the
probability distribution $(p_i, |\varphi_i\rangle)_i$ is $\sigma = \sum_i p_i |\varphi_i\rangle\langle\varphi_i|$. The state resulting from the application of
the unitary transformation $U$ to the state described by the density matrix $\sigma$ is described by the
density matrix $U\sigma U^\dagger$. Now assume that $(|\psi_i\rangle)_i$ is an orthonormal basis of eigenvectors of $\sigma$ and
that $\lambda_i$ is the eigenvalue belonging to $|\psi_i\rangle$. Then the von Neumann entropy of $\sigma$ is defined as
$S(\sigma) = -\sum_i \lambda_i \log \lambda_i$. The von Neumann entropy is invariant under unitary transformations $U$,
i.e., $S(U\sigma U^\dagger) = S(\sigma)$. Furthermore, if $\sigma$ is a density matrix over a (finite-dimensional) Hilbert
space $\mathcal{H}$, then $S(\sigma) \le \log(\dim(\mathcal{H}))$. Finally, we formally introduce the kind of measurements
that are relevant here.

Definition 6.4: Let $J$ be a finite index set and let $\mathcal{M} = (P_j)_{j \in J}$ be a family of projection
operators over the finite-dimensional Hilbert space $\mathcal{H}$ with $\sum_{j \in J} P_j = I$. Then call $\mathcal{M}$ a
projective measurement over $\mathcal{H}$ with results in $J$. For any density matrix $\sigma$ over $\mathcal{H}$, define the
probability of measuring result $i \in J$ in the state described by $\sigma$ by $\Pr\{\mathcal{M}(\sigma) = i\} = \mathrm{tr}(\sigma P_i)$.

The following lemma is due to Nayak. In the lemma, $H(p)$ denotes the binary entropy function
defined by $H(p) = -p \log p - (1 - p) \log(1 - p)$.

Lemma 6.5 ([25]): Let $\sigma_0$ and $\sigma_1$ be density matrices over the finite-dimensional Hilbert
space $\mathcal{H}$ and let $\sigma = 1/2 \cdot (\sigma_0 + \sigma_1)$. Suppose there is a projective measurement $\mathcal{M} = (P_0, P_1)$
over $\mathcal{H}$ with results in $\{0,1\}$ such that for $b \in \{0,1\}$, $\Pr\{\mathcal{M}(\sigma_b) = b\} \ge p > 1/2$. Then
$S(\sigma) \ge (S(\sigma_0) + S(\sigma_1))/2 + (1 - H(p))$.

Now we are ready to prove the main result of this section, which is stated in the following 
theorem. The corollary directly follows from the upper bound on the OBDD size mentioned 
above. 

Theorem 6.6: The size of each QOBDD with bounded error for $\mathrm{DISJ}_n$ or $\mathrm{IP}_n$ is $2^{\Omega(n)}$.
Corollary 6.7: P-OBDD $\not\subseteq$ BQP-OBDD.

Proof of Theorem 6.6. We only prove the statement for disjointness; the claim for the inner
product follows in the same way. Let a QOBDD $G$ with some variable order $\pi$ for $\mathrm{DISJ}_n$
be given. W.l.o.g. let $G$ be leveled. Due to the symmetry of the OR-function, we may assume
w.l.o.g. that for each $i \in \{1, \ldots, n/2\}$ the variable $x_{2i-1}$ is tested before $x_{2i}$ in $\pi$. Let $p = 1/2 + \varepsilon$
be a lower bound on the success probability of $G$. We generate random inputs $x$ for $\mathrm{DISJ}_n$ in
the following way. Each variable with an odd index gets one of the values 0 and 1 with a
probability of 1/2 each. All variables with an even index get the value 0. Let $\sigma(k)$ denote the
density matrix describing the state of the QOBDD after reading the $k$th variable with an odd
index. By induction we prove $S(\sigma(k)) \ge (1 - H(p))k$. Since the state of the QOBDD before
reading the first randomly chosen variable is a pure state, we have $S(\sigma(0)) = 0$. Now let $k \ge 1$.
By induction hypothesis $S(\sigma(k-1)) \ge (1 - H(p))(k-1)$. Let $x_i$ be the $k$th variable with an odd
index. Let $U_0$ and $U_1$ be the unitary transformations performed by the QOBDD while reading
all the variables after the $(k-1)$-st variable with an odd index and up to $x_i$ inclusively, where
the latter gets the value 0 or 1, resp. Since $x_i$ is chosen to be 0 or 1 at random,

$$\sigma(k) = \frac{1}{2}\left(U_0\,\sigma(k-1)\,U_0^\dagger + U_1\,\sigma(k-1)\,U_1^\dagger\right).$$

Let $U$ denote the composition of the unitary transformations performed by the QOBDD if the
partner $x_{i+1}$ of $x_i$ gets the value 1 and all other variables read after $x_i$ get the value 0. Then
the function $\mathrm{DISJ}_n$ attains the value $\overline{c}$ if $x_i = c \in \{0,1\}$. Let $\sigma' = U\sigma(k)U^\dagger$. Since the QOBDD
computes the function $\mathrm{DISJ}_n$, the measurement of the QOBDD on $\sigma'$ yields the result $\overline{c}$ with
a probability of at least $p$ if $x_i$ has the value $c$. By Lemma 6.5 (with the roles of the two
measurement results exchanged) and the invariance of the von Neumann entropy under unitary
transformations,

$$S(\sigma(k)) = S(\sigma') \ge \frac{1}{2}\left(S\big(UU_0\,\sigma(k-1)\,U_0^\dagger U^\dagger\big) + S\big(UU_1\,\sigma(k-1)\,U_1^\dagger U^\dagger\big)\right) + 1 - H(p)$$
$$= \frac{1}{2}\big(S(\sigma(k-1)) + S(\sigma(k-1))\big) + 1 - H(p).$$

Then the claim follows by the induction hypothesis. We obtain the lower bound $(1 - H(p)) \cdot n/2$
on the von Neumann entropy of the density matrix describing the state of $G$ after reading all
variables with odd indices. By the above remark, this implies the lower bound $2^{(1-H(p)) \cdot n/2}$ on
the dimension of the state space of $G$ and, therefore, also on the size of $G$. □
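To get a feeling for the bound, the recurrence and the resulting size bound can be evaluated numerically. This is a minimal sketch with hypothetical function names; it assumes that all logarithms are taken to base 2, which is what the step from an entropy bound to the dimension bound $2^{S}$ suggests.

```python
import math


def binary_entropy(p):
    """H(p) = -p log p - (1 - p) log(1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def entropy_after_k_steps(p, k):
    """Unroll S(k) >= S(k-1) + 1 - H(p), starting from S(0) = 0."""
    s = 0.0
    for _ in range(k):
        s += 1.0 - binary_entropy(p)
    return s


def size_lower_bound(p, n):
    """Dimension (and hence size) bound 2^((1 - H(p)) * n / 2)."""
    return 2.0 ** ((1.0 - binary_entropy(p)) * n / 2)
```

For success probability $p = 1/2$ the bound degenerates to 1, as expected, while any constant $p > 1/2$ gives growth exponential in $n$.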

6.3. A Partial Function with Small QOBDDs that Requires Large Randomized 
OBDDs 

An OBDD or QOBDD for a partially defined function has to compute the correct value of the 
function only on the domain of the function, while it may compute an arbitrary result on inputs 
outside the domain. We present a partially defined function with polynomial size QOBDDs but 
only exponential size randomized OBDDs. The idea behind the construction of the function is 
based on a result of Raz [31] for communication protocols.

The function we consider gets unitary matrices as inputs. In order to obtain a finitely rep- 
resentable function, we redundantly encode sufficiently precise approximations of the desired 
matrices by boolean variables. The redundancy in the encoding will allow us to prove a lower 
bound for arbitrary variable orders. 

For the following, fix an even $n \in \mathbb{N}$ and let $\varepsilon > 0$. Let $b = 6(n-1)$ and let $W_0, \ldots, W_{b-1}$ be
some fixed enumeration of the matrices in $Q_n$ from Lemma 3.4. Let $k = k(n, \varepsilon) = O(n^2 \log(n/\varepsilon))$
be the number from this lemma. For $\ell \ge k$ and $m \ge b-1$ the universal $(\varepsilon, \ell, m)$-code of $n \times n$-matrices
consists of the $\ell(m+1)$ boolean variables $x_{i,j}$, $1 \le i \le \ell$, $1 \le j \le m+1$. For $1 \le i \le \ell$
let $x_i = (x_{i,1}, \ldots, x_{i,m+1})$ and $v(x_i) = x_{i,1} + \cdots + x_{i,m}$. Let

$$U_i = \begin{cases} W_{(v(x_1) + \cdots + v(x_i)) \bmod b}, & \text{if } x_{i,m+1} = 1; \\ I, & \text{if } x_{i,m+1} = 0. \end{cases}$$

Then the variable vector $x = (x_1, \ldots, x_\ell)$ encodes the matrix

$$W(x) = U_\ell \cdot U_{\ell-1} \cdots U_1.$$

Note that the variables $x_{i,m+1}$ only switch between $W_{(v(x_1) + \cdots + v(x_i)) \bmod b}$ and the identity matrix.
In particular, they do not influence the sum $(v(x_1) + \cdots + v(x_i)) \bmod b$. By Lemma 3.4
for each unitary $n \times n$-matrix $U$ there is a setting to the $x$-variables such that $\|U - W(x)\| \le \varepsilon$.
In the following, $\ell$ is much larger than $k$ such that there are many settings to obtain a certain
unitary matrix in the product approximating $U$.

Now we define the considered function. Let $|1\rangle, \ldots, |n\rangle$ be the standard basis of $\mathbb{C}^n$. Let $V_0$
and $V_1$ denote the subspaces spanned by the first and last $n/2$ of these basis vectors. Let
$0 < \vartheta < 1/\sqrt{2}$. The input for the function $R_{\vartheta,\ell,m,n}$ consists of $3\ell(m+1)$ boolean variables
$a_{i,j}, b_{i,j}, c_{i,j}$, $1 \le i \le \ell$, $1 \le j \le m+1$, which are interpreted as universal $(\varepsilon, \ell, m)$-codes for three
unitary $n \times n$-matrices $A, B, C$, where $\varepsilon = 1/(3n)$. The function takes the value $z \in \{0, 1\}$ if the
Euclidean distance between $CBA|1\rangle$ and $V_z$ is at most $\vartheta$. Otherwise the function is undefined.
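The way a universal code selects the factors of the product $W(x)$ can be made concrete without knowing the matrices themselves. The sketch below only computes the sequence of indices into the enumeration $W_0, \ldots, W_{b-1}$; the function name and the row layout ($m$ counter bits followed by one switch bit) are assumptions made for this illustration.

```python
def selected_indices(x, b):
    """Indices r_1, r_2, ... such that W(x) = ... * W_{r_2} * W_{r_1}.

    x is a list of rows; each row consists of m counter bits followed by
    one switch bit (the variable x_{i,m+1} in the text).
    """
    running = 0                     # (v(x_1) + ... + v(x_i)) mod b
    chosen = []
    for row in x:
        *counter_bits, switch = row
        running = (running + sum(counter_bits)) % b
        if switch:                  # switch bit 1: multiply by W_running
            chosen.append(running)  # switch bit 0: multiply by the identity
    return chosen
```

Flipping a switch bit removes or adds one factor but leaves the running sum, and hence all later indices, unchanged. This is the property that the lower bound proof exploits when whole blocks of variables are fixed to 0.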

We first prove the upper bound on the size of QOBDDs.

Theorem 6.8: Let $0 < \vartheta < 1/\sqrt{2}$. The function $R_{\vartheta,3k,9kb,n}$ with an input size of
$N = 81k^2b + 9k = O(n^5 \log^2 n)$ has QOBDDs with error at most $\vartheta^2$ and size $O(N^{9/5} / \log^{8/5} N)$.
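The stated input size follows from the definition of the function: three matrices, each encoded by $\ell(m+1)$ variables with $\ell = 3k$ and $m = 9kb$. A small check, with a hypothetical function name:

```python
def input_size(k, b):
    """Number of variables of R_{theta, 3k, 9kb, n}: 3 * l * (m + 1)."""
    ell = 3 * k            # rows per encoded matrix
    m = 9 * k * b          # counter bits per row (plus one switch bit)
    return 3 * ell * (m + 1)
```

Expanding $3 \cdot 3k \cdot (9kb + 1)$ gives $81k^2b + 9k$, in agreement with the theorem.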

Proof. Set $\ell = 3k$ and $m = 9kb$. We choose the variable order that starts with the $a$-variables
ordered as $a_{1,1}, \ldots, a_{1,m+1}, a_{2,1}, \ldots, a_{2,m+1}, \ldots, a_{\ell,1}, \ldots, a_{\ell,m+1}$. Afterwards the $b$-variables and
then the $c$-variables are tested in analogous orders. We first describe a subgraph $G_A$ of the
QOBDD evaluating the $a$-variables. Analogous subgraphs $G_B$ and $G_C$ are constructed for the
$b$- and $c$-variables, resp.

The nodes of $G_A$ are arranged in $bn$ columns, which we label by $(r, s)$ with $0 \le r \le b-1$ and
$1 \le s \le n$, and in levels $1, \ldots, \ell(m+1)+1$. Let $|r\rangle|s\rangle|t\rangle$ be the vector from an orthonormal
basis that corresponds to the node of the $t$th level in column $(r, s)$. The nodes in each of the
first $\ell(m+1)$ levels are labeled by the same $a$-variable according to the variable order. The
last level consists of sinks. Let $p \in \{1, \ldots, \ell\}$. For $j \in \{1, \ldots, m\}$, the node labeled by $a_{p,j}$ in
column $(r, s)$ is left by a single 0-edge with amplitude 1 leading to the node of the next level of
the same column and a single 1-edge with amplitude 1 leading to the node of the next level in
column $((r+1) \bmod b, s)$. For a node labeled by $a_{p,m+1}$ in column $(r, s)$, a single 0-edge with
amplitude 1 leaving this node leads to the node in column $(r, s)$ of the subsequent level. There
are 1-edges connecting this node to the nodes of the subsequent level such that the mapping
$|r\rangle|s\rangle|t\rangle \mapsto |r\rangle(W_r|s\rangle)|t+1\rangle$ is performed, where $t = (p-1)(m+1) + m + 1$.

It is easy to verify that the graph $G_A$ constructed in this way is well-formed and unidirectional.
We evaluate $G_A$ according to the semantics of QBPs starting from a node on the first level
in column $(r, s)$, i.e., with the superposition $|r\rangle|s\rangle|1\rangle$. Then after reading the variable vectors
$a_1, \ldots, a_p$, where $a_i = (a_{i,1}, \ldots, a_{i,m+1})$, we reach the superposition $|r'\rangle(U_p \cdots U_1|s\rangle)|t\rangle$ with
$r' = (r + v(a_1) + \cdots + v(a_p)) \bmod b$ and $t = (p-1)(m+1) + m + 2$.

The QOBDD for $R_{\vartheta,\ell,m,n}$ starts with $G_A$, where the node on the first level in column $(0, 1)$ is
chosen as the start node. Then the amplitude for reaching a node of the $(\ell(m+1)+1)$-st level
in column $(r, s)$ of $G_A$ is exactly the $s$th coordinate of $A|1\rangle$, if $r$ is the sum modulo $b$ of all
$a$-variables, and 0 otherwise. After reading the $a$-variables, the value of $r$ is no longer needed;
however, it cannot be erased in a QOBDD. Hence, for each possible value $r$ we add a copy of a
subgraph $G_B$ processing the variables encoding $B$ in the same way as described before for $A$.
The sink in column $(r, s)$ of the $(\ell(m+1)+1)$-st level of the subgraph $G_A$ for $A$ is identified
with the node $(0, s)$ of the $r$th copy of the subgraph $G_B$ for $B$. Altogether $b$ copies of the
subgraph $G_B$ are sufficient. In the same way $b^2$ copies of a subgraph $G_C$ for processing $C$ are
sufficient. In each copy of $G_C$, the sink in column $(r, s)$ of the last level is a 0-sink if $s \le n/2$,
and a 1-sink otherwise. For each input, there is exactly one copy of $G_C$ and exactly one $r$
such that for all $s$ the amplitude of the node in column $(r, s)$ of the last level equals the $s$th
coordinate of $CBA|1\rangle$. For all other copies of $G_C$ and for all other $r$ the amplitudes are 0.

Let $E_z$ denote the projection to the subspace $V_z$. If $|y\rangle = CBA|1\rangle$ has distance at most $\vartheta$
from the subspace $V_z$, we have $\vartheta^2 \ge \||y\rangle - E_z|y\rangle\|^2 = 1 - \|E_z|y\rangle\|^2$. The equality follows by
an easy calculation. Hence, the measurement on the level of the sinks leads to the result $z$
with probability $\|E_z|y\rangle\|^2 \ge 1 - \vartheta^2$. The size of the QOBDD is dominated by the $b^2$ copies
of $G_C$. Each of these copies has size $O(bnN)$. Hence, the size can be estimated by $O(b^3nN) =
O(n^9 \log^2 n) = O(N^{9/5} / \log^{8/5} N)$. □

In order to prove the lower bound, we apply arguments from communication complexity (see,
e.g., [16, 22] for an introduction). We first state a result of Raz who has proved a lower
bound on the communication complexity for a different function $R^0_{\vartheta,n}$. Using two rectangular
reductions, which are defined below, we transfer this lower bound to a lower bound on the
communication complexity of $R_{\vartheta,\ell,m,n}$ for any $\ell \ge k$ and $m \ge b-1$. Finally, by a standard lower
bound technique for randomized OBDDs, the lower bound on the communication complexity
implies a lower bound on the size of randomized OBDDs.

We define the function due to Raz by describing the corresponding communication problem.
Let $0 < \vartheta < 1/\sqrt{2}$. The input of Alice consists of a unit vector $x \in \mathbb{R}^n$ and two orthogonal
subspaces $S_0$ and $S_1$ of $\mathbb{R}^n$ of dimension $n/2$ each. Bob gets an orthogonal real-valued $n \times n$-matrix
$T$ as input. The output is $c \in \{0, 1\}$ if $Tx$ has distance at most $\vartheta$ from $S_c$, and
arbitrary otherwise. We remark that the usual definition of communication complexity can
easily be extended to the case of infinite input sets which is considered here. Raz has proved
the following result.

Theorem 6.9 ([31]): Let $0 < \vartheta < 1/\sqrt{2}$. Each randomized communication protocol with
bounded error for $R^0_{\vartheta,n}$ requires $\Omega(n^{1/2})$ bits of communication.

We note that the considered communication problems are partially defined. On inputs for
which such a problem is not defined, both outputs 0 and 1 are allowed. A partially defined
communication problem on input sets $X$ and $Y$ can also be described by a relation $R \subseteq X \times
Y \times \{0, 1\}$, where $(x,y,z) \in R$ iff $z$ is a valid output for $(x,y)$. In particular, if the problem
is undefined for $(x,y)$, we have $(x,y,0), (x,y,1) \in R$. A rectangular reduction from $R' \subseteq
X' \times Y' \times \{0, 1\}$ to $R \subseteq X \times Y \times \{0, 1\}$ consists of two mappings $f: X' \to X$ and $g: Y' \to Y$
such that $(f(x), g(y), z) \in R \Rightarrow (x,y,z) \in R'$. It is easy to see that a lower bound on the
communication complexity for $R'$ implies the same lower bound for $R$ if there is a rectangular
reduction from $R'$ to $R$.

We observe that the problem $R^0_{\vartheta,n}$ can easily be reduced to the following infinite precision variant
$R'_{\vartheta,n}$ of the considered problem $R_{\vartheta,\ell,m,n}$. The input of $R'_{\vartheta,n}$ consists of unitary $n \times n$-matrices
$A$, $B$ and $C$, where Alice gets $A$ and $C$, and Bob gets $B$. Their task is to compute $z \in \{0, 1\}$
if the distance between $CBA|1\rangle$ and $V_z$ is bounded by $\vartheta$. (Again, $V_0 = \mathrm{span}\{|1\rangle, \ldots, |n/2\rangle\}$
and $V_1 = \mathrm{span}\{|n/2+1\rangle, \ldots, |n\rangle\}$.) Obviously, $R^0_{\vartheta,n}$ is a special case of $R'_{\vartheta,n}$: Instead of an
orthogonal matrix $T$, a unitary matrix $B$ is allowed. The vector $x$ and the subspaces $V_0$ and $V_1$
are now encoded by the unitary matrices $A$ and $C$. Hence, the lower bound from Theorem 6.9
also holds for $R'_{\vartheta,n}$. The second rectangular reduction is given in the following lemma.

Lemma 6.10: For all constants $\vartheta, \vartheta'$ with $0 < \vartheta' < \vartheta < 1/\sqrt{2}$, for all $\ell \ge k$ and $m \ge b-1$,
and for sufficiently large $n$, $R'_{\vartheta',n}$ is reducible to $R_{\vartheta,\ell,m,n}$.

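Proof. Let $(A', B', C')$ be an arbitrary input for $R'_{\vartheta',n}$. We map this input to an input for
$R_{\vartheta,\ell,m,n}$ consisting of the universal $(\varepsilon, \ell, m)$-codes of unitary $n \times n$-matrices $A, B, C$ with

$$\|A - A'\| \le \varepsilon, \quad \|B - B'\| \le \varepsilon, \quad \text{and} \quad \|C - C'\| \le \varepsilon,$$

where $\varepsilon = 1/(3n)$. By Lemma 3.4 we can find such an input $(A, B, C)$ for $R_{\vartheta,\ell,m,n}$. We show
that this mapping is even a rectangular reduction. Let $E_0$ and $E_1$ be the projections on the
subspaces $V_0$ and $V_1$, resp. Let $|y\rangle = CBA|1\rangle$ and $|y'\rangle = C'B'A'|1\rangle$.

Let w.l.o.g. 0 be a solution of $R_{\vartheta,\ell,m,n}$ for the input $(A, B, C)$. Then either $\||y\rangle - E_0|y\rangle\| \le \vartheta$,
i.e., the only valid output is 0, or $\||y\rangle - E_0|y\rangle\| > \vartheta \wedge \||y\rangle - E_1|y\rangle\| > \vartheta$, i.e., the outputs 0 and
1 are allowed. This is equivalent to $\||y\rangle - E_1|y\rangle\| > \vartheta$. We prove that 0 is also a solution of the
problem $R'_{\vartheta',n}$ for the input $(A', B', C')$ by showing that $\||y'\rangle - E_1|y'\rangle\| > \vartheta'$.

By the choice of $A$, $B$ and $C$ and by Proposition 3.6 we obtain $\||y'\rangle - |y\rangle\| \le 3\varepsilon = 1/n$. By
the assumption, $\|E_0|y\rangle\| = \||y\rangle - E_1|y\rangle\| > \vartheta$. Hence, $\||y\rangle - E_0|y\rangle\| = (1 - \|E_0|y\rangle\|^2)^{1/2} <
(1 - \vartheta^2)^{1/2}$ and thus

$$\|E_1|y'\rangle\| \le \|E_1(|y'\rangle - |y\rangle)\| + \|E_1|y\rangle\| \le \||y'\rangle - |y\rangle\| + \||y\rangle - E_0|y\rangle\| < \frac{1}{n} + (1 - \vartheta^2)^{1/2}.$$

This implies $\||y'\rangle - E_1|y'\rangle\|^2 = 1 - \|E_1|y'\rangle\|^2 > \vartheta^2 - \frac{2}{n}(1 - \vartheta^2)^{1/2} - \frac{1}{n^2}$. Since $\vartheta' < \vartheta$ and both are
constants, it follows that $\vartheta^2 - \frac{2}{n}(1 - \vartheta^2)^{1/2} - \frac{1}{n^2} > \vartheta'^2$ for sufficiently large $n$. Hence, 0 is a solution of
$R'_{\vartheta',n}$ for the input $(A', B', C')$. □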

Altogether we obtain a lower bound on the communication complexity of $R_{\vartheta,\ell,m,n}$ for $\ell \ge k$ and
$m \ge b-1$.

Corollary 6.11: Let $0 < \vartheta < 1/\sqrt{2}$, $\ell \ge k$, and $m \ge b-1$. Each randomized communication
protocol with bounded error for $R_{\vartheta,\ell,m,n}$ where Alice has the matrices $A$ and $C$ and Bob the
matrix $B$ requires $\Omega(n^{1/2})$ bits of communication.

Now we can prove the second part of the main result of this section, the lower bound on the 
size of randomized OBDDs with bounded error. 

Theorem 6.12: Let $0 < \vartheta < 1/\sqrt{2}$. Each randomized OBDD with bounded error for the
function $R_{\vartheta,3k,9kb,n}$ on $N = 81k^2b + 9k = O(n^5 \log^2 n)$ variables has size $2^{\Omega(N^{1/10} / \log^{1/5} N)}$.

It remains open to find an example of a total function with polynomial size QOBDDs but only 
exponential size randomized OBDDs. Using the currently available techniques, this seems to 
be difficult since the known lower bound techniques for randomized OBDDs, which are based 
on randomized communication complexity, also work in the quantum case (see Klauck [19]).

Proof of Theorem 6.12. Let $G$ be a given randomized OBDD for $R_{\vartheta,\ell,m,n}$ with $\ell = 3k$ and
$m = 9kb$ and with an arbitrary variable order. In general, the variables encoding the matrices
$A$, $B$, and $C$ do not occur as contiguous groups in the variable order. Because of the redundancy
of the encoding of the matrices we can construct a suborder where the variables of each of the
encodings of $A$, $B$, and $C$ are grouped together such that the corresponding subproblem of
$R_{\vartheta,\ell,m,n}$ is still hard. Then we can apply the above communication complexity lower bound.
Let $\pi$ denote the order of the variables $a_{i,j}, b_{i,j}, c_{i,j}$, $1 \le i \le \ell$, $1 \le j \le m$, in $G$. For $A$ (and
similarly $B$ and $C$) call each set of variables $a_{i,1}, \ldots, a_{i,m}$ in its encoding a block. The variables
$a_{i,m+1}$, $b_{i,m+1}$, and $c_{i,m+1}$ do not occur in any block or in $\pi$.

Claim. There is a suborder $\pi'$ of $\pi$ such that for each matrix of $A$, $B$ and $C$ there are exactly $k$
consecutive blocks in $\pi'$ that each contain exactly $b$ variables.

Proof of the claim. Think of $\pi$ as a list of all variables (except $a_{i,m+1}$, $b_{i,m+1}$, and $c_{i,m+1}$) in the
prescribed order. Observe that there are $9k$ blocks of $m = 9kb$ variables each encoding some
matrix from the set $Q_n$.



We divide $\pi$ into $9k$ contiguous parts such that for each block there is a part that contains
at least $b$ of its variables and such that for different blocks there are different parts with this
property. The first of these parts is chosen by searching for the first position in the variable
order $\pi$ where for some block $b$ variables have been tested (and hence for all other blocks less
than $b$ variables have been tested). Then this block is chosen and the other variables up to
the chosen position are eliminated. Furthermore, all other variables of the chosen block are
eliminated. An easy induction shows that this procedure can be iterated until $9k$ parts are
chosen. Thus we are left with $9k$ smaller blocks with exactly $b$ variables each and such that for
each original block there is a smaller block in the list.

We now use the same idea to partition the list of variable blocks obtained in the first step into
three parts such that for each of the three matrices there is a part containing at least $k$ of its
blocks and such that for different matrices there are different parts with this property. Again we
eliminate variables in order to ensure that for each matrix exactly $k$ consecutive blocks remain
in the variable order. In this way, we obtain a variable order $\pi'$ with the desired properties. □
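The first step of this procedure is a simple greedy scan and can be sketched as follows. The representation is an assumption made for illustration: the variable order is given as a list that records, for each position, only the identifier of the block the variable belongs to, and the function name is hypothetical.

```python
def select_parts(order, b):
    """Greedy scan over a variable order.

    order[pos] is the block id of the variable at position pos.
    Returns a list of (block, positions): for each chosen block the b
    positions that are kept; all other scanned variables are eliminated.
    """
    chosen = []
    done = set()           # blocks that were already chosen
    seen = {}              # positions seen per block inside the current part
    for position, block in enumerate(order):
        if block in done:
            continue       # leftover variables of a chosen block are eliminated
        seen.setdefault(block, []).append(position)
        if len(seen[block]) == b:      # first block with b tested variables
            chosen.append((block, seen[block]))
            done.add(block)
            seen = {}      # the other variables of this part are eliminated
    return chosen
```

With $9k$ blocks of $9kb$ variables each, every block that has not been chosen yet loses fewer than $b$ variables per round, so it still has at least $b$ variables left in each of the at most $9k$ rounds. This is the induction mentioned in the proof, and it is the reason why the scan always finds $9k$ parts.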

We replace all eliminated variables with 0 and remove the nodes labeled by these variables in the
randomized OBDD and redirect incoming edges to the 0-successor. Furthermore, if all variables
$a_{i,1}, \ldots, a_{i,m}$ of a block are eliminated, we also replace $a_{i,m+1}$ with 0 and modify the randomized
OBDD accordingly. The same is done for the eliminated blocks of $b$- and $c$-variables. This yields
a randomized OBDD $G'$ for $R_{\vartheta,k,b,n}$ that is at most as large as $G$.

We prove the desired lower bound for $G'$ using the standard lower bound technique for randomized
OBDDs (see, e.g., [12]). Observe that the variable order $\pi'$ consists of three parts belonging
to the different matrices $A, B, C$ in some arbitrary order. Let $C_1$ be the set of nodes which are
reached by some path on which exactly the variables for the first matrix according to $\pi'$ have
been tested, and let $C_2$ be the set of nodes which are reached by some path on which exactly
the variables in the first two matrices have been tested. The OBDD can be used to build a
randomized one- or two-round communication protocol for $R_{\vartheta,k,b,n}$ where Alice has the variables
for $A$ and $C$ and Bob the variables for $B$. The players jointly follow a computation path in the
OBDD from the start node to a sink, using random bits for decisions at random nodes of the
OBDD and communicating the numbers of nodes in the sets $C_1$ and $C_2$. The communication
complexity of this protocol is bounded by $\lceil \log|C_1| \rceil + \lceil \log|C_2| \rceil \le 2(\log|G'| + 1)$. Together with
Corollary 6.11 this yields the claimed lower bound. □

6.4. Las Vegas QOBDDs Versus Reversible OBDDs 

The main result of this section is that ZQP-OBDD $\subseteq$ Rev-OBDD. This means that even the
zero-error QOBDD model with some failure probability is no more powerful with respect to 
polynomial size than reversible OBDDs. 

The essence of the proof is as follows. Given a reversible OBDD G and a Las Vegas QOBDD G' 
for the same function and with the same variable order, we show that G' induces collections 
of measurements, called measurement schemes here, that allow us to distinguish the subfunctions
represented at each of the levels of G. We further prove that for such a measurement scheme, the 
dimension of the underlying Hilbert space can be lower bounded in terms of the number of those 
subfunctions. Altogether, we obtain a lower bound on the size of the Las Vegas QOBDD G' in 
terms of the size of the reversible OBDD G. 

Definition 6.13: Let $\mathcal{H}$ be a finite-dimensional Hilbert space and let $|v_1\rangle, \ldots, |v_m\rangle \in \mathcal{H}$ be
different pure quantum states. Let $X = \{1, \ldots, m\}$ and $Y = \{1, \ldots, n\}$. Call an $m \times n$-matrix
$A = (a_{ij})$ with entries in $\{0, 1, *\}$ and projective measurements $\mathcal{M}_j = (M_{j,0}, M_{j,1}, M_{j,?})$ with
possible results $\{0, 1, ?\}$, where $j = 1, \ldots, n$, a measurement scheme for $|v_1\rangle, \ldots, |v_m\rangle$ with zero
error and failure probability $\varepsilon$, $0 \le \varepsilon < 1$, if

(i) for all different $i, j \in X$ there is a $k \in Y$ such that $a_{ik}, a_{jk} \in \{0, 1\}$ and $a_{ik} \ne a_{jk}$;

(ii) for all $i \in X$ and $j \in Y$, if $a_{ij} = *$, then $a_{ik} = *$ for all $j \le k \le n$; and

(iii) for all $i \in X$ and $j \in Y$, if $a_{ij} \in \{0, 1\}$, then $\Pr\{\mathcal{M}_j(|v_i\rangle) = a_{ij}\} \ge 1 - \varepsilon$ and
$\Pr\{\mathcal{M}_j(|v_i\rangle) = \neg a_{ij}\} = 0$.

A measurement scheme allows us to distinguish any pair of vectors from $|v_1\rangle, \ldots, |v_m\rangle \in \mathcal{H}$ by
zero error measurements. Our aim is to prove a lower bound on the dimension of $\mathcal{H}$ in terms
of $m$. For this, we use the following lemma due to Klauck which is a Las Vegas variant of
Lemma 6.5.

Lemma 6.14 ([19]): Let $\sigma_0, \sigma_1$ be density matrices over $\mathcal{H}$ and let $0 \le p \le 1$. Suppose that
there is a projective measurement $\mathcal{M} = (M_0, M_1, M_?)$ with possible results $\{0, 1, ?\}$ such that
$\Pr\{\mathcal{M}(\sigma_b) = b\} \ge 1 - \varepsilon$ and $\Pr\{\mathcal{M}(\sigma_b) = \neg b\} = 0$ for all $b \in \{0, 1\}$. Let $\sigma = p\sigma_0 + (1-p)\sigma_1$.
Then $S(\sigma) \ge pS(\sigma_0) + (1-p)S(\sigma_1) + (1-\varepsilon)H(p)$.

The following lemma extends a result of Klauck that gives a lower bound on the Las Vegas
one-way quantum communication complexity in terms of deterministic one-way communication
complexity. The proof of Klauck provides the main idea of the proof of Lemma 6.15 for
measurement schemes without "*"-entries.

Lemma 6.15: Let $|v_1\rangle, \ldots, |v_m\rangle \in \mathcal{H}$ be different pure quantum states. If there is a measurement
scheme for $|v_1\rangle, \ldots, |v_m\rangle$ with zero error and failure probability $\varepsilon$, then $\dim(\mathcal{H}) \ge m^{1-\varepsilon}$.

Proof. Let $A$ be the $m \times n$-matrix with entries from $\{0, 1, *\}$, and let $\mathcal{M}_1, \ldots, \mathcal{M}_n$ be the
projective measurements in the given measurement scheme for $|v_1\rangle, \ldots, |v_m\rangle$. Let $X = \{1, \ldots, m\}$
and $Y = \{1, \ldots, n\}$. Call two rows of $A$ distinguishable if they differ in a column where both
of them have boolean values. Thus the rows of $A$ are pairwise distinguishable according to the
hypothesis.

In the following we inductively define a mixed state over $\mathcal{H}$ with large von Neumann entropy
in order to obtain the lower bound on the dimension of $\mathcal{H}$. The mixed states that we consider
are convex combinations of the pure states $\sigma_i = |v_i\rangle\langle v_i|$, $i = 1, \ldots, m$. For any $I \subseteq X$, $j \in Y$,
and $b \in \{0, 1\}$ let $I_{j,b} = \{i \in I \mid a_{ij} = b\}$.

(i) For $I \subseteq X$ with $|I| \ge 2$ and $j \in Y$ such that all rows in the submatrix $I \times \{j, j+1, \ldots, n\}$
of $A$ are distinguishable, let $\sigma(I,j) = (|I_{j,1}|/|I|) \cdot \sigma(I_{j,1}, j+1) + (|I_{j,0}|/|I|) \cdot \sigma(I_{j,0}, j+1)$.

(ii) Let $\sigma(\{i\}, j) = \sigma_i$ for $i \in X$ and $1 \le j \le n+1$.

If the rows in the submatrix $I \times \{j, j+1, \ldots, n\}$ of $A$ are distinguishable, by condition (ii) of
Definition 6.13 the $j$th column of the submatrix only contains the entries 0 and 1: If it contained
an entry "*", the whole row would consist of "*" and would thus not be distinguishable from
the other rows. It follows that $\sigma(X, 1)$ is well defined by a recursive application of the above
definition, since (by induction), all rows in $I$ are pairwise distinguishable as long as $|I| \ge 2$,
in which case part (i) is applicable. After some applications of part (i), finally part (ii) is
applicable.

Claim. For each $I \subseteq X$ and $j \in Y$ such that all rows in the submatrix $I \times \{j, j+1, \ldots, n\}$ of
$A$ are distinguishable, $S(\sigma(I,j)) \ge (1-\varepsilon) \log|I|$.

By the claim $S(\sigma(X, 1)) \ge (1-\varepsilon) \log m$ and $\dim(\mathcal{H}) \ge 2^{S(\sigma(X,1))} \ge m^{1-\varepsilon}$, which implies
Lemma 6.15. It remains to prove the claim by an induction on the definition of $\sigma(I,j)$.

Induction base (Part (ii) of the definition): Then $S(\sigma(\{i\},j)) = 0$ for all $i \in X$ and $1 \le j \le n+1$.

Induction step (Part (i) of the definition): We consider $\sigma(I,j) = p \cdot \sigma(I_{j,0}, j+1) +
(1-p) \cdot \sigma(I_{j,1}, j+1)$, where $p = |I_{j,0}|/|I|$. Observe that $I = I_{j,0} \cup I_{j,1}$ and that for $b \in \{0, 1\}$,
$\sigma(I_{j,b}, j+1) = \sum_{i \in I_{j,b}} p_i \sigma_i$ for suitable probabilities $p_i$, $i \in I_{j,b}$, with $\sum_{i \in I_{j,b}} p_i = 1$ (the latter
can also be proved by an easy induction on the definition of the $\sigma(I,j)$). Thus, applying the
measurement $\mathcal{M}_j$ to $\sigma(I_{j,b}, j+1)$ yields

$$\Pr\{\mathcal{M}_j(\sigma(I_{j,b}, j+1)) = b\} \ge 1 - \varepsilon \quad \text{and} \quad \Pr\{\mathcal{M}_j(\sigma(I_{j,b}, j+1)) = \neg b\} = 0.$$

By Lemma 6.14 this implies

$$S(\sigma(I,j)) \ge p \cdot S(\sigma(I_{j,0}, j+1)) + (1-p) \cdot S(\sigma(I_{j,1}, j+1)) + (1-\varepsilon)H(p).$$

By the induction hypothesis, $S(\sigma(I_{j,b}, j+1)) \ge (1-\varepsilon) \log|I_{j,b}|$ for $b \in \{0, 1\}$. Thus,

$$S(\sigma(I,j)) \ge p(1-\varepsilon)\log|I_{j,0}| + (1-p)(1-\varepsilon)\log|I_{j,1}| + (1-\varepsilon)H(p)$$
$$= (1-\varepsilon)\big(p\log|I_{j,0}| + (1-p)\log|I_{j,1}| + H(p)\big).$$

Using that $p|I| = |I_{j,0}|$ and $(1-p)|I| = |I_{j,1}|$ we get

$$S(\sigma(I,j)) \ge (1-\varepsilon)\big(p\log(p|I|) + (1-p)\log((1-p)|I|) + H(p)\big)$$
$$= (1-\varepsilon)\big(p\log p + (1-p)\log(1-p) + H(p) + \log|I|\big) = (1-\varepsilon)\log|I|,$$

as desired. This completes the proof of the claim and thus the proof of Lemma 6.15. □
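The telescoping in the induction step can be observed numerically by unrolling the recursion for a concrete matrix. The sketch below is an illustration under simplifying assumptions: the matrix has only boolean entries (no "*"), its rows are pairwise different, logarithms are base 2, and the function names are hypothetical. It computes the entropy lower bound that the recursion yields, not an actual von Neumann entropy.

```python
import math


def binary_entropy(p):
    """H(p) = -p log p - (1 - p) log(1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def entropy_bound(rows, matrix, j, eps):
    """Lower bound on S(sigma(I, j)) obtained by unrolling the recursion.

    rows is the index set I, j the current column (0-based here).
    """
    if len(rows) <= 1:                     # part (ii): a pure state, entropy 0
        return 0.0
    zeros = [i for i in rows if matrix[i][j] == 0]
    ones = [i for i in rows if matrix[i][j] == 1]
    p = len(zeros) / len(rows)
    return (p * entropy_bound(zeros, matrix, j + 1, eps)
            + (1 - p) * entropy_bound(ones, matrix, j + 1, eps)
            + (1 - eps) * binary_entropy(p))
```

For five pairwise different rows and $\varepsilon = 1/4$ the recursion returns $(3/4) \log 5$, independently of how the rows split column by column, which matches the claim $S(\sigma(I,j)) \ge (1-\varepsilon) \log|I|$.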
Now we can state and prove the main result. 

Theorem 6.16: Let $G$ be a minimum size, leveled, reversible $\pi$-OBDD for $f$. Let $G'$ be a
leveled $\pi$-QOBDD that computes $f$ with zero error and failure probability $\varepsilon$, $0 \le \varepsilon < 1$. For
$i = 1, \ldots, n+1$, let $L_i$ and $L'_i$ be the sets of nodes on level $i$ in $G$ and $G'$, resp. Then $|L'_i| \ge |L_i|^{1-\varepsilon}$
for $i = 1, \ldots, n+1$. In particular, $|G'| \ge |G|^{1-\varepsilon}$.

Corollary 6.17: Rev-OBDD = EQP-OBDD = ZQP-OBDD.

Proof of Theorem 6.16. W.l.o.g. let $G = (V, E)$ and $G' = (V', E')$ have the variable order
$x_1, \ldots, x_n$. From $G$ and $G'$ we construct some set of vectors which are intermediate states
of the computation of $G'$. We exploit the relation to $G$ in order to construct a measurement
scheme for these vectors such that the lower bound follows from Lemma 6.15.

W. 1. o. g. / depends on all variables. Let 5: V' xV' x{0,l} — > C denote the transition amplitudes 
of G'. Let TC be the Hilbert space spanned by an orthonormal basis (\v)) v& y whose elements 
are identified with the nodes of G' . Let s E V and s' G V be the start nodes of G and G', 
resp., and let F C V' be the set of sinks of G' . For a partial input assignment a to xi, . . . , Xi, 
let \(f(a)) £ TC be the superposition reached in G' by carrying out its computation on a. Let 
-Mgink = (M s i n k,0) -^sink.ii Msink,?) be the projective measurement of the output label at the sinks 
in G' . For b G {0, 1}, fix a unitary operator U b on TL such that U b \v) = X^gy S(v,w,b)\w) for 
all v G V — F. Such an operator exists due to the well-formedness of G' . 
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By the assumptions of the theorem, Lj is the set of all nodes of G reached by partial assignments 
to x%, . . . , Xi— i, for i — 1, . . . , 71+1. Observe that L\ — {s} and, since G is leveled and / depends 
on all variables, all nodes in Lj, 1 < % < n, are labeled by Xi. For a node v £ V, let denote 
the subfunction of / represented at v according to the usual semantics of deterministic OBDDs. 

We recursively construct mappings $\mathrm{asn}_i$ for $i = 1,\ldots,n+1$ such that $\mathrm{asn}_i$ maps a node $v \in L_i$
to a partial assignment to $x_1,\ldots,x_{i-1}$ reaching that node from the start node of $G$. First, we
choose $\mathrm{asn}_1(s)$ as the empty assignment. Next consider a level $L_i$ with $i > 1$. Let $v_1,\ldots,v_\ell$ be all
nodes representing one of the subfunctions $f_{\mathrm{sub}}$ represented at nodes in $L_i$. Since $G$ is reversible
and of minimum size, there are a constant $b \in \{0,1\}$ and different nodes $u_1,\ldots,u_\ell \in L_{i-1}$
such that $(f_{u_j})|_{x_{i-1}=b} = f_{\mathrm{sub}}$ and there is a $b$-edge from $u_j$ to $v_j$ for $j = 1,\ldots,\ell$. Define
$\mathrm{asn}_i(v_j) = (\mathrm{asn}_{i-1}(u_j), b)$ for $j = 1,\ldots,\ell$. For $i = 1,\ldots,n+1$, let $C_i = \{|\varphi(\mathrm{asn}_i(v))\rangle \mid v \in L_i\}$.

Claim. For each $i = 1,\ldots,n+1$, there is a measurement scheme for $C_i$ with zero error and
failure probability $\varepsilon$.

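By Lemma 6.15, the claim implies $|L'_i| \ge \dim(\mathrm{span}(C_i)) \ge |L_i|^{1-\varepsilon}$ and thus the first part of
the theorem. Since $(x_1 + \cdots + x_k)^c \ge x_1^c + \cdots + x_k^c$ for all $c \ge 1$ and $x_1,\ldots,x_k \in \mathbb{R}_0^+$, also
$|G'| \ge |G|^{1-\varepsilon}$ follows: sum the level-wise bounds and apply the inequality with $c = 1/(1-\varepsilon)$
and $x_i = |L_i|^{1-\varepsilon}$.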
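The step from the level-wise bounds to the bound on the total size can be illustrated numerically. This is only a sketch on made-up level sizes (the list `sizes` and the function name are hypothetical): it sums the per-level bounds $|L_i|^{1-\varepsilon}$ and compares the sum with $|G|^{1-\varepsilon}$.

```python
def level_bounds(level_sizes, eps):
    """Per-level lower bounds |L_i|^(1-eps) and their sum, a lower bound on |G'|."""
    per_level = [size ** (1 - eps) for size in level_sizes]
    return per_level, sum(per_level)

# Hypothetical level sizes |L_1|, ..., |L_{n+1}| of the reversible OBDD G.
sizes = [1, 2, 4, 8, 2]
eps = 0.5
per_level, total = level_bounds(sizes, eps)
g_size = sum(sizes)

# The summed level bounds dominate |G|^(1-eps), as the
# superadditivity of x -> x^c for c >= 1 guarantees.
assert total >= g_size ** (1 - eps)
```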

We prove the claim by induction on $i$. For $i = 1$ and $C_1 = \{|\varphi(\mathrm{asn}_1(s))\rangle\} = \{|s'\rangle\}$ the empty
measurement scheme has the required properties.

Let $i > 1$ and $L_i = \{v_1,\ldots,v_m\}$. Let $Y = \{y_1,\ldots,y_N\}$, $N = 2^{n-i+1}$, be the set of assignments
to $x_i,\ldots,x_n$. Define the $m \times N$-matrix $A = (a_{jk})$ by setting $a_{jk} = f_{v_j}(y_k)$ for $1 \le j \le m$ and
$1 \le k \le N$. For $k = 1,\ldots,N$ let $\mathcal{M}_k = (M_{k,0}, M_{k,1}, M_{k,?})$ be the projective measurement with
$M_{k,x} = M_{\mathrm{sink},x} U_{y_k}$ where $x \in \{0,1,?\}$ and $U_{y_k}$ is the unitary transformation carried out by $G'$
for the partial input $y_k$ when started on a superposition of the basis vectors $(|w\rangle)_{w \in L'_i}$.

Obviously, $A$ is a boolean matrix where two rows $j, j' \in \{1,\ldots,m\}$ differ iff the corresponding
subfunctions $f_{v_j}$ and $f_{v_{j'}}$ differ on an input from $Y$. Hence, for each set of pairwise different
rows of $A$ chosen as representatives for the different subfunctions and vectors in $C_i$ chosen accordingly,
the above definitions yield a measurement scheme due to the fact that $G'$ computes $f$
with zero error and failure probability $\varepsilon$. Our goal is to extend the matrix $A$ and the collection
of measurements such that we obtain a measurement scheme for all vectors in $C_i$. We remark
that $A$ does not have entries "*".

Consider a subset of rows of $A$ belonging to the same subfunction $f_{\mathrm{sub}}$ and thus containing
identical vectors. W.l.o.g., let $v_1,\ldots,v_\ell$ be the respective nodes in $L_i$ representing $f_{\mathrm{sub}}$. Let
$u_1,\ldots,u_\ell \in L_{i-1}$ and $b \in \{0,1\}$ be as in the definition of the assignments $\mathrm{asn}_i(v_j)$ above. In
particular, $b$ is the same constant for $u_1,\ldots,u_\ell$. Then $U_b|\varphi(\mathrm{asn}_{i-1}(u_j))\rangle = |\varphi(\mathrm{asn}_{i-1}(u_j), b)\rangle =
|\varphi(\mathrm{asn}_i(v_j))\rangle$. By induction hypothesis, there is a measurement scheme for $C_{i-1}$. Let $D$ be the
matrix of this measurement scheme, which is of size $|C_{i-1}| \times p$ for some $p$. Consider the subscheme
for the vectors $|\varphi(\mathrm{asn}_{i-1}(u_j))\rangle$, $j = 1,\ldots,\ell$, which we obtain from $D$ by deleting the
rows corresponding to the other vectors. Let this measurement scheme be described by the
$\ell \times p$-matrix $B = (b_{jk})$ and the projective measurements $\mathcal{P}_k = (P_{k,0}, P_{k,1}, P_{k,?})$, $k = 1,\ldots,p$.
Define $\mathcal{P}'_k = (P'_{k,0}, P'_{k,1}, P'_{k,?})$, $k = 1,\ldots,p$, by $P'_{k,x} = P_{k,x} U_b^\dagger$ for $x \in \{0,1,?\}$.
Then for $j \in \{1,\ldots,\ell\}$ and $k \in \{1,\ldots,p\}$ such that $b_{jk} \in \{0,1\}$,

$$\Pr\{\mathcal{P}'_k(|\varphi(\mathrm{asn}_i(v_j))\rangle) = b_{jk}\} = \|P'_{k,b_{jk}}|\varphi(\mathrm{asn}_i(v_j))\rangle\|^2 = \|P_{k,b_{jk}} U_b^\dagger|\varphi(\mathrm{asn}_i(v_j))\rangle\|^2$$
$$= \|P_{k,b_{jk}} U_b^\dagger U_b|\varphi(\mathrm{asn}_{i-1}(u_j))\rangle\|^2 = \Pr\{\mathcal{P}_k(|\varphi(\mathrm{asn}_{i-1}(u_j))\rangle) = b_{jk}\}.$$

Hence, the measurements $\mathcal{P}'_k$, $k = 1,\ldots,p$, satisfy property (iii) in the definition of measurement
schemes with respect to the matrix $B$.

Let $B_1,\ldots,B_r$ be all submatrices of $D$ obtained by the above construction for the different
subfunctions $f_v$, $v \in L_i$. Since in the construction of $B_1,\ldots,B_r$ no columns of $D$ are
deleted, the columns of $B_1,\ldots,B_r$ are labeled by the same measurements. Hence, we can
attach the matrices $B_1,\ldots,B_r$ to $A$ as submatrices in the columns $N+1,\ldots,N+p$ and
fill up the remaining entries with "*" such that the new matrix $A'$ obtained in this way and
the measurements $\mathcal{M}_1,\ldots,\mathcal{M}_N, \mathcal{P}'_1,\ldots,\mathcal{P}'_p$ comprise a measurement scheme for $C_i$ with zero
error and failure probability $\varepsilon$. Since $A$ does not have any "*"-entries, also property (ii) of
the definition of measurement schemes is fulfilled. □

The above lower bound on the size of zero error QOBDDs in terms of the size of reversible
OBDDs is essentially optimal, as the following example shows. For $n = 2^\ell$ define the index
function $\mathrm{IND}_n\colon \{0,1\}^{n+\ell} \to \{0,1\}$ on variable vectors $x = (x_0,\ldots,x_{n-1})$ and $y = (y_0,\ldots,y_{\ell-1})$
by $\mathrm{IND}_n(x,y) = x_{|y|}$, where $|y| = \sum_{i=0}^{\ell-1} y_i 2^i$.

Proposition 6.18: For the variable order $\pi$ described by $(x_0,\ldots,x_{n-1},y_0,\ldots,y_{\ell-1})$, each deterministic
$\pi$-OBDD representing $\mathrm{IND}_n$ requires size $2^n$, while the same function can be computed
by zero error $\pi$-QOBDDs with failure probability $\varepsilon$ of size $2^{(1-\varepsilon)n + O(\log n)}$.

Hromkovic and Schnitger have used a similar function to prove an analogous result for 
classical Las Vegas and deterministic one-way communication complexity and the special case 
of failure probability e = 1/2. The proof of the proposition is by a straightforward adaptation 
of a simple randomized OBDD to the quantum case. 

Proof. The lower bound for deterministic OBDDs is well known and follows from the fact that 
IND n has maximal one-way communication complexity with respect to the partition of variables 
where Alice obtains x and Bob obtains y. In the following, we briefly sketch the upper bound 
construction. 

For $\varepsilon \ge 1/2$, partition $x$ into $k = \lfloor 1/(1-\varepsilon) \rfloor$ blocks of size approximately $(1-\varepsilon)n$. The QOBDD
chooses one of these blocks at random by an unlabeled node at the top (which can be removed
later on, similarly to an earlier proof in Section 6) with outgoing edges having amplitudes $1/\sqrt{k}$.
These edges lead to sub-QOBDDs where the complete chosen block is read and stored, which
requires a binary tree with $O(2^{(1-\varepsilon)n})$ nodes for each block. At each leaf of such a tree, append
a tree of size $O(n)$ reading $y$ and computing $|y|$. Finally, a sink with the correct output value
is reached if $|y|$ lies in the chosen block, which happens with probability at least $1/k \ge 1-\varepsilon$.
Otherwise, the "?"-sink is reached.

For $\varepsilon < 1/2$, we select $k = \lceil 1/\varepsilon \rceil$ blocks of $x$-variables of size approximately $(1-\varepsilon)n$ that cover
each single variable exactly $k-1$ times. The rest of the construction is the same as above. The
failure probability is obviously bounded above by $1/k \le \varepsilon$. □
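The choice of blocks in the two cases can be sketched classically. The code below only models which index positions each block covers and the probability that a uniformly chosen block misses the queried position $|y|$; it does not model the QOBDD itself. The concrete block layout (contiguous parts, and complements of parts for $\varepsilon < 1/2$) is an assumption made for illustration, since the proof fixes the block sizes only approximately.

```python
import math
from fractions import Fraction

def make_blocks(n, eps):
    """Blocks of x-indices for failure probability eps, 0 < eps < 1."""
    eps = Fraction(eps)
    large_eps = eps >= Fraction(1, 2)
    k = math.floor(1 / (1 - eps)) if large_eps else math.ceil(1 / eps)
    # Split {0, ..., n-1} into k contiguous parts of nearly equal size.
    parts = [set(range(i * n // k, (i + 1) * n // k)) for i in range(k)]
    if large_eps:
        return parts  # k disjoint blocks: each index lies in exactly one block
    everything = set(range(n))
    # Complements of the parts: each index lies in exactly k - 1 blocks.
    return [everything - part for part in parts]

def failure_probability(blocks, index):
    """Probability that a uniformly chosen block does not contain index."""
    hits = sum(1 for block in blocks if index in block)
    return 1 - Fraction(hits, len(blocks))

blocks = make_blocks(16, Fraction(1, 3))
worst = max(failure_probability(blocks, i) for i in range(16))
```

For $\varepsilon = 1/3$ this yields $k = 3$ blocks and a failure probability of $1/3$ for every index, matching the bound $1/k \le \varepsilon$.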

6.5. Comparison of QOBDDs and Read-Once QBPs 

In this section we observe that, similarly to the classical case, QOBDDs are a more restricted 
model of QBPs than read-once QBPs. A function separating these two models with respect to 
polynomial size is the so-called indirect storage access function, which is defined in the following 
way. Let $n = 2^k$. The input of $\mathrm{ISA}_n$ consists of the variables $y_0,\ldots,y_{k-1}$ and $x_0,\ldots,x_{n-1}$.
The $y$-variables are interpreted as a binary number $s$. The $x$-variables are partitioned into
$b = \lfloor n/k \rfloor$ blocks of size $k = \log n$, which are numbered beginning with 0. If $s \ge b$, the output
is 0. Otherwise the $s$th block is again interpreted as a binary number $t$ and the output is $x_t$. It
is straightforward to construct a decision tree for $\mathrm{ISA}_n$ of size $O(n^2/\log n)$, which can also be
regarded as a read-once QBP.
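For concreteness, the two functions used in this comparison can be written down as plain reference implementations. This is a minimal sketch: the bit order for $\mathrm{IND}_n$ follows the definition $|y| = \sum_i y_i 2^i$, while using the same little-endian convention for the numbers $s$ and $t$ in $\mathrm{ISA}_n$ is an assumption, since the text does not fix it. The helper name `to_number` is hypothetical.

```python
def to_number(bits):
    """Little-endian value of a bit sequence: sum of bits[i] * 2**i."""
    return sum(bit << i for i, bit in enumerate(bits))

def ind(x, y):
    """Index function IND_n(x, y) = x_{|y|} for n = 2**len(y)."""
    assert len(x) == 2 ** len(y)
    return x[to_number(y)]

def isa(y, x):
    """Indirect storage access function ISA_n for n = 2**k, k = len(y)."""
    k, n = len(y), len(x)
    assert n == 2 ** k
    s = to_number(y)
    b = n // k                      # number of blocks of size k
    if s >= b:
        return 0
    block = x[s * k:(s + 1) * k]    # the s-th block, numbered from 0
    t = to_number(block)            # t < 2**k = n, so x[t] exists
    return x[t]
```

With $k = 2$ and $x = (1, 1, 0, 1)$, the call `isa([0, 0], x)` reads block 0, which is $(1, 1)$, obtains $t = 3$, and returns $x_3 = 1$.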

The lower bound for QOBDDs for all variable orders is a straightforward combination of two
results. Klauck proved the lower bound $\Omega(n)$ on the quantum one-way communication
complexity of $\mathrm{IND}_n$, where Alice gets the $x$-variables and Bob the $y$-variables. This lower
bound directly implies the lower bound $2^{\Omega(n)}$ on the size of QOBDDs for $\mathrm{IND}_n$, where the
$x$-variables are tested before the $y$-variables. Using a rectangular reduction, it has been shown
in [HI] that an OBDD for $\mathrm{ISA}_n$ and an arbitrary variable order cannot be smaller than an OBDD
for $\mathrm{IND}_{\lfloor n/\log n \rfloor - 1}$ and the variable order mentioned before. This also holds for QOBDDs such
that we obtain the lower bound $2^{\Omega(n/\log n)}$ on the size of QOBDDs for $\mathrm{ISA}_n$ and an arbitrary
variable order.

7. QBPs with Generalized Measurements 

The usual unitary quantum mode of computation has turned out to be only of limited use 
for such restricted models as quantum OBDDs and quantum finite automata. In this section 
we consider a generalization of QBPs where in each step the performed unitary operation is 
determined by the result of a previous measurement. We first present the definition of QBPs 
with generalized measurements and we discuss the relationship to QBPs and to randomized 
BPs. Afterwards, we prove a generic lower bound on the size of QOBDDs with generalized 
measurements for so-called $k$-stable functions.

Definition 7.1: Let $k \in \mathbb{N}$ with $k \ge 3$. A quantum branching program with generalized measurements
(gmQBP) over the variable set $X = \{x_1,\ldots,x_n\}$ is a directed multigraph $G = (V,E)$
with a start node $s \in V$, a set of sinks $F \subseteq V$, and transition amplitudes $\delta$. Nodes and edges
are labeled in the same way as in a usual QBP (see Definition 2.4). Additionally, there is a
partition $(V_0, V_1, V_2, \ldots, V_{k-1})$ of $V$ such that $V_0$ and $V_1$ consist of the 0- and 1-sinks of $G$,
resp. The edge labels of the gmQBP $G$ have to fulfill the following modified well-formedness
constraint. Let $u, v \in V_\ell$, $\ell \in \{2,\ldots,k-1\}$, be interior nodes with $\mathrm{var}(u) = i$ and $\mathrm{var}(v) = j$,
resp. Then for all assignments $a = (a_1,\ldots,a_n)$ to the variables in $X$,

$$\sum_{w \in V} \delta^*(u,w,a_i)\,\delta(v,w,a_j) = \begin{cases} 1, & \text{if } u = v; \\ 0, & \text{otherwise.} \end{cases}$$

Furthermore, gmQBPs are unidirectional, i.e., for each $w \in V$, all $v \in V$ for which a $b \in \{0,1\}$
exists such that $\delta(v,w,b) \ne 0$ are labeled by the same variable.

We remark that the well-formedness condition for gmQBPs is weaker than the well-formedness
condition for ordinary QBPs, because it only has to hold for pairs of nodes of the same set $V_\ell$.

We now define the semantics of gmQBPs. As in the definition of usual QBPs, nodes correspond
to vectors in an orthonormal basis $(|v\rangle)_{v \in V}$ of $\mathcal{H} = \mathbb{C}^{|V|}$ and intermediate results of the
computation are superpositions of these vectors. As for QBPs, a computation step consists of
a measurement and the subsequent transition to successor nodes according to the transition
amplitudes $\delta$. In a gmQBP, the measurement generalizes that allowed for QBPs as follows.
The gmQBP performs the projective measurement $\mathcal{M} = (P_0, P_1, P_2, \ldots, P_{k-1})$ with results
$\{0, 1, 2, \ldots, k-1\}$, where

$$P_r = \sum_{v \in V_r} |v\rangle\langle v|, \qquad r \in \{0, 1, 2, \ldots, k-1\}.$$

For the current superposition $|\psi\rangle$, the probability of obtaining the result $r$ is $\|P_r|\psi\rangle\|^2$. If the
result $r$ is 0 or 1, the computation stops with output $r$. If $r \ge 2$, the computation continues with
the normalized projection

$$|\psi'\rangle = \frac{P_r|\psi\rangle}{\|P_r|\psi\rangle\|} = \sum_{v \in V_r} \alpha_v |v\rangle.$$

Then for each node $v \in V_r$ with $\mathrm{var}(v) = i$ the gmQBP follows the edges with boolean label $a_i$
according to their amplitudes. This yields the new superposition

$$\sum_{v \in V_r} \alpha_v \sum_{w \in V} \delta(v, w, a_{\mathrm{var}(v)})\,|w\rangle.$$

The above definition does not allow "?" outputs for simplicity, since we do not consider 
Las Vegas gmQBPs, anyway. The modified well-formedness constraint implies that for each 
result of the measurement the corresponding mapping can be extended to a unitary transfor- 
mation. Computation time and acceptance modes are defined analogously to QBPs. Also the 
definition of QOBDDs with generalized measurements (gmQOBDDs) is straightforward: The 
variables are required to be tested according to a fixed variable order. We remark that gmQBPs
have a simple graphic representation: in addition to the representation of QBPs, there is merely
a partition of the nodes.
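One computation step of a gmQBP (measurement with respect to the partition, normalized projection, transition) can be sketched with plain dictionaries. This is an illustrative sketch under stated assumptions, not the construction of the paper: states are maps from node names to amplitudes, `part` maps each node to the index $r$ of its set $V_r$, and the toy program with nodes `u1`, `u2`, `z`, `o` is made up for the example.

```python
import math

def measurement_probs(state, part):
    """Probability of each measurement result r, i.e. ||P_r |psi>||^2."""
    probs = {}
    for v, amp in state.items():
        probs[part[v]] = probs.get(part[v], 0.0) + abs(amp) ** 2
    return probs

def project(state, part, r):
    """Normalized projection of the state onto the nodes of V_r."""
    kept = {v: amp for v, amp in state.items() if part[v] == r}
    norm = math.sqrt(sum(abs(amp) ** 2 for amp in kept.values()))
    return {v: amp / norm for v, amp in kept.items()}

def transition(state, delta, var, a):
    """Follow the edges labeled a[var[v]] from every node v in the state."""
    new_state = {}
    for v, amp in state.items():
        for w, edge_amp in delta[v][a[var[v]]].items():
            new_state[w] = new_state.get(w, 0.0) + amp * edge_amp
    return new_state

# Hypothetical toy gmQBP: u1 and u2 form V_2 and both test x_0;
# z is a 0-sink and o is a 1-sink. The two amplitude rows are orthonormal.
h = 1 / math.sqrt(2)
part = {"u1": 2, "u2": 2, "z": 0, "o": 1}
var = {"u1": 0, "u2": 0}
delta = {
    "u1": {0: {"z": h, "o": h}, 1: {"z": h, "o": h}},
    "u2": {0: {"z": h, "o": -h}, 1: {"z": h, "o": -h}},
}
state = project({"u1": h, "u2": h}, part, 2)
state = transition(state, delta, var, a=(0,))
probs = measurement_probs(state, part)
```

Because `u1` and `u2` lie in the same set $V_2$, the measurement does not separate them, and their amplitudes interfere: in this toy example the amplitude of the 1-sink cancels and the 0-sink is reached with probability 1 (up to rounding).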

The physical realizability of gmQBPs depends on the ability to perform measurements during 
a computation. Based on a standard argument using Neumark's theorem (see, e.g., [25]),
such measurements can be described by unitary transformations in an extended Hilbert space.
Furthermore, intermediate measurements are also possible, e. g., in the quantum circuit model 
defined in the textbook of Nielsen and Chuang as well as in the model of Aharonov, Kitaev 
and Nisan [1] which allows gates computing general quantum operations (superoperators). 

It is obvious that a QBP is a gmQBP with three possible measurement results. We show that 
randomized BPs can easily be transformed into gmQBPs. 

Proposition 7.2: For each randomized BP G computing some function f there is a gmQBP G' 
computing the same function with the same acceptance mode, and the size of G' is bounded above 
by the size of G. 

Proof. We remove all randomized nodes from $G$ by allowing each node to have several outgoing
0- and 1-edges labeled by appropriate probabilities. In the corresponding gmQBP there are the
same edges, where the probability $p$ is replaced with the amplitude $\sqrt{p}$. The partition of the
node set consists of the set of 0-sinks, the set of 1-sinks and sets each containing exactly one
interior node. An easy induction shows that for each input the acceptance probabilities of $G$
and $G'$ coincide. □
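The transformation in this proof can be tried out on a small example. The sketch below is a hedged illustration: the toy randomized BP `RBP`, its sink names, and the dictionary encoding are all hypothetical. It assumes that every interior node forms its own set of the partition, so that each measurement collapses the state onto a single node, and that no node has two parallel edges with the same label to the same successor. Under these assumptions a probability $p$ and an amplitude $\sqrt{p}$ contribute the same weight.

```python
import math

# Hypothetical toy randomized BP without randomized nodes:
# node -> (variable index, {bit: [(successor, probability), ...]}).
RBP = {
    "s": (0, {0: [("u", 0.5), ("r1", 0.5)], 1: [("u", 0.25), ("a1", 0.75)]}),
    "u": (1, {0: [("r2", 1.0)], 1: [("a2", 0.5), ("r2", 0.5)]}),
}
SINKS = {"a1": 1, "r1": 0, "a2": 1, "r2": 0}  # sink -> output label

def to_gmqbp(bp):
    """Replace every edge probability p by the amplitude sqrt(p)."""
    return {
        v: (var, {bit: [(w, math.sqrt(p)) for w, p in out]
                  for bit, out in edges.items()})
        for v, (var, edges) in bp.items()
    }

def accept_prob(bp, x, weight, start="s"):
    """Acceptance probability; weight maps an edge label to a probability."""
    dist = {start: 1.0}
    while any(v not in SINKS for v in dist):
        nxt = {}
        for v, pr in dist.items():
            if v in SINKS:
                nxt[v] = nxt.get(v, 0.0) + pr
                continue
            var, edges = bp[v]
            for w, label in edges[x[var]]:
                nxt[w] = nxt.get(w, 0.0) + pr * weight(label)
        dist = nxt
    return sum(pr for v, pr in dist.items() if SINKS[v] == 1)

def randomized_prob(x):
    return accept_prob(RBP, x, lambda p: p)

def gmqbp_prob(x):
    # After each measurement the state is a single node, so the successor
    # probabilities are the squared absolute values of the amplitudes.
    return accept_prob(to_gmqbp(RBP), x, lambda amp: abs(amp) ** 2)
```

For every input $x \in \{0,1\}^2$, `randomized_prob(x)` and `gmqbp_prob(x)` agree up to floating-point rounding.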

With the currently available techniques we cannot prove superpolynomial lower bounds for BPs
and for QBPs either (cf. Proposition 2.7). Thus we are not able to prove that polynomial size
gmQBPs are more powerful than polynomial size QBPs. However, for QOBDDs this is easy,
even for $k = 4$, i.e., the smallest $k$ where gmQOBDDs are a generalization of QOBDDs. In
Section 6 we have proved exponential lower bounds on the size of QOBDDs for the functions
DISJ and IP. On the other hand, it is easy to construct linear size deterministic OBDDs for
DISJ and IP. A careful inspection shows that each node of these OBDDs has at most two
incoming 0-edges and at most one incoming 1-edge. We partition the internal nodes into two
sets $V_2$ and $V_3$ such that each pair of nodes with the same 0-successor is not in the same set.
Furthermore, by duplicating the sinks we ensure that each sink has at most one predecessor.
The sets $V_0$ and $V_1$ are the sets of 0- and 1-sinks, resp., that are obtained in this way. We obtain
the following result.

Proposition 7.3: There are gmQOBDDs of linear size with $k = 4$ possible measurement results
that exactly compute $\mathrm{DISJ}_n$ and $\mathrm{IP}_n$.

Finally, we prove a generic lower bound on the size of gmQOBDDs for $k$-stable functions. A
function $f\colon \{0,1\}^n \to \{0,1\}$ is called $k$-stable if for each set $V$ of variables of size $k$ and each
variable $x_i \in V$ there is a setting of the variables outside $V$ such that the resulting subfunction
is $x_i$ or $\overline{x}_i$. It is well known that $k$-stable functions only have read-once branching programs
of size at least $2^{k-1}$, and it has been shown in [31] that also randomized OBDDs require size $2^{\Omega(k)}$.
Examples for such functions include the determinant of an $n \times n$-matrix over $\mathbb{Z}_2$, which is
$(n-1)$-stable, and the function checking whether a graph on $n$ vertices has an $n/2$-clique,
which is $(n/4+1)$-stable. For these and other examples, see Wegener [42].
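For small $n$, the definition of $k$-stability can be checked by brute force. The following is a minimal sketch with hypothetical function names; it enumerates all variable sets of size $k$, all target variables in the set, and all settings of the remaining variables, so it is only feasible for very small inputs.

```python
from itertools import combinations, product

def restricts_to_literal(f, n, chosen, outside, setting, target):
    """True iff fixing the outside variables turns f into x_target or its negation."""
    seen = set()
    for inner in product((0, 1), repeat=len(chosen)):
        x = [0] * n
        for i, bit in zip(outside, setting):
            x[i] = bit
        for i, bit in zip(chosen, inner):
            x[i] = bit
        # f XOR x_target is constant 0 iff f = x_target, constant 1 iff f = NOT x_target.
        seen.add(f(tuple(x)) ^ x[target])
    return len(seen) == 1

def is_k_stable(f, n, k):
    """Brute-force check of k-stability for f: {0,1}^n -> {0,1}."""
    for chosen in combinations(range(n), k):
        outside = [i for i in range(n) if i not in chosen]
        for target in chosen:
            settings = product((0, 1), repeat=len(outside))
            if not any(restricts_to_literal(f, n, chosen, outside, s, target)
                       for s in settings):
                return False
    return True

def parity(x):
    return sum(x) % 2
```

As a sanity check, parity on three variables is 1-stable but not 2-stable: any subfunction on two chosen variables still depends on both of them.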

We remark that the state of a gmQOBDD after performing a measurement during a computation
can be described as a mixed state, i.e., a probability distribution over pure states. Now we can
apply a lower bound on the quantum communication complexity for the index function (defined
at the end of Section 6.4) due to Klauck.

Theorem 7.4: Each gmQOBDD with bounded error for a $k$-stable function has size $2^{\Omega(k)}$.

Proof. W.l.o.g. let $k = 2^\ell$. Klauck [JS] has observed that the quantum one-way communication
complexity of the function $\mathrm{IND}_k$ is lower bounded by $\Omega(k)$ for the partition where the first player
Alice gets the input vector $x = (x_0,\ldots,x_{k-1})$ and the second player Bob gets $y = (y_0,\ldots,y_{\ell-1})$.
This lower bound also holds for the two-sided error model and if Alice may send a mixed state
to Bob. Let a gmQOBDD for a $k$-stable function $f$ be given. Then $\mathrm{IND}_k$ can be computed
by a quantum one-way protocol in the following way: Alice may choose the first $k$ variables in
the variable order and Bob the remaining variables. By the property of $k$-stable functions, for
each of Alice's variables, Bob can fix his variables such that the gmQOBDD outputs the value
of the variable or its complement. Hence, it suffices for Alice to perform the computation of
the gmQOBDD on the first $k$ levels for the given setting of her $x$-variables and to send the
(mixed) state of the gmQOBDD after her computation to Bob. Bob can then compute the
output as described. The communication complexity is bounded above by the logarithm of the
size (or even the width) of the gmQOBDD. Together with the lower bound on the quantum
communication complexity for $\mathrm{IND}_k$, the theorem follows. □

8. Open Problems 

In this paper, we have explored the foundations of space-bounded nonuniform quantum com- 
plexity to some extent, but several interesting problems nevertheless remain open. 

- It is not clear whether algebraic amplitudes for nonuniform QTMs and short amplitudes for 
QBPs are the most general reasonable sets of amplitudes. Is it possible to provide some formal 
argument that excludes more general sets of amplitudes (as done by Adleman, DeMarrais, 
and Huang [3] for the uniform case and arbitrary complex amplitudes)?

- For space-bounded nonuniform QTMs with algebraic amplitudes we have proved that the 
general model can be simulated by the unidirectional one. It is open so far whether an analo- 
gous simulation also exists for the uniform case. Furthermore, for QBPs it is straightforward 
to define a variant without the requirement of unidirectionality. Can this generalized model 
be simulated by the unidirectional model or is it unreasonably powerful?
- It remains open whether there is a space-efficient simulation of QBPs by nonuniform QTMs for
the cases of error-free and exact quantum computation and, if not, to provide some evidence 
showing that such a simulation is unlikely to exist. 

- With respect to the comparison of OBDDs and QOBDDs, the relationship between the classes 
BQP-OBDD and BPP-OBDD for total functions is left open. 

- Prove lower bounds for more general variants of QBPs. While lower bounds for QOBDDs 
can be obtained using tools from quantum communication complexity, already the proof of 
lower bounds for (possibly unordered) read-once QBPs seems to require new arguments. 

- The model of gmQBPs remains largely open to investigation. In particular, the relationship 
between the standard model of QBPs and gmQBPs needs to be further clarified. Show 
separation results as that for QOBDDs and gmQOBDDs presented here also for more general 
variants of QBPs or investigate simulations of gmQBPs by usual QBPs.